Abstract. In the simple mean-field SIS and SIR epidemic models, infection is transmitted from
infectious to susceptible members of a finite population by independent p-coin tosses. Spatial
variants of these models are considered, in which finite populations of size $N$ are situated at the
sites of a lattice and infectious contacts are limited to individuals at neighboring sites. Scaling laws
for these models are given when the infection parameter p is such that the epidemics are critical. It
is shown that in all cases there is a critical threshold for the numbers initially infected: below the
threshold, the epidemic evolves in essentially the same manner as its branching envelope, but at the
threshold evolves like a branching process with a size-dependent drift. The corresponding scaling
limits are super-Brownian motions and Dawson-Watanabe processes with killing, respectively.



1. Introduction 

1.1. Critical mean-field epidemics: threshold behavior. It was discovered by Martin-Löf [18]
and independently by Aldous [1] that the simple mean-field SIR epidemic, also known as the
Reed-Frost epidemic, exhibits a curious threshold behavior at criticality. Roughly, if $U_N$ is the
size (that is, the number of individuals ever infected) of the epidemic in a population of size $N$,
then $U_N$ has a markedly different asymptotic distribution when the number $J_0$ of individuals
infected at time 0 is of order $N^{1/3}$ than when it is of order $o(N^{1/3})$. In particular, if
$J_0 \sim bN^{\alpha}$ as $N \to \infty$, then

(1) $U_N / N^{2\alpha} \Longrightarrow \tau_b$,

where $\tau_b$ is the first passage time to the level $b$ by a standard Wiener process, if $\alpha < 1/3$, or by
a Wiener process with time-dependent drift $t$, if $\alpha = 1/3$. This reflects the fact that the size of
the largest connected component in a critical ($p = 1/N$) Erdős-Rényi random graph on $N$ vertices
is of order $N^{2/3}$. R. Dolgoarshinnykh and the author have observed that there is a similar
critical threshold effect for the simple mean-field SIS epidemic, but here the threshold for $J_0$ is at
$N^{1/2}$ rather than $N^{1/3}$, and the limit distribution at the threshold involves first passage times by
Ornstein-Uhlenbeck processes. In fact, there is an asymptotic form for the entire evolution of the
epidemic at criticality that undergoes a discontinuity at $J_0 \approx N^{1/2}$: if $J_n$ denotes the number of
individuals infected at time $n$, then

(2) $N^{-\alpha} J_{[N^{\alpha} t]} \Longrightarrow Y_t$

where $Y_0 = b$ and $Y_t$ is either a Feller diffusion or a Feller diffusion with location-dependent drift
$-Y_t^2\,dt$, depending on whether $\alpha < 1/2$ or $\alpha = 1/2$, that is,

(3) $dY_t = \sqrt{Y_t}\,dW_t$ if $\alpha < 1/2$;

    $dY_t = -Y_t^2\,dt + \sqrt{Y_t}\,dW_t$ if $\alpha = 1/2$.

Note that in the case $\alpha < 1/2$ the limit process (the Feller diffusion) is the same as the limit
process for the rescaled critical Galton-Watson process. There is a similar process-level threshold
effect for the Reed-Frost epidemic at $J_0 \approx N^{1/3}$; see [9].

1.2. Spatial SIS and SIR epidemics. The purpose of this article is to show that there is a
similar critical threshold effect for spatial epidemics in one spatial dimension; see Theorem 1 below.
The epidemic models considered are simple discrete-time spatial analogues of the Reed-Frost and
stochastic logistic epidemics. These are chosen primarily to streamline the mathematical analysis;
however, analogous effects should also be expected for more complex models in dimension $d = 1$.
A secondary motivation for the specification of the spatial SIR model is that it has a percolation
(random graph) description similar to the Erdős-Rényi random graph description of the Reed-Frost
epidemic, and so our main result can be interpreted as a statement about the connected clusters in
certain percolation models.

The spatial epidemics, which we will henceforth call the SIR-d and SIS-d epidemics, are defined
as follows: Assume that at each lattice point $x \in \mathbb{Z}^d$ is a homogeneous population ("village") of
$N$ individuals, each of whom may at any time be either susceptible or infected, or (in the SIR
variants) recovered. As in the corresponding mean-field models (see [18]), infected individuals remain
infected for one unit of time, and then recover; in the SIR-d epidemic, infected individuals recover
and are thereafter immune from infection, while in the SIS-d model, infected individuals, upon
recovery, become once again susceptible to infection. The rules of infection are the same as for
the corresponding mean-field models, except that the infection rates depend on the locations of the
infected and susceptible individuals. Thus, at each time $t = 0, 1, 2, \dots$, for each pair $(i_x, s_y)$ of an
infected individual located at $x$ and a susceptible individual at $y$, the disease spreads from $i_x$ to
$s_y$ with probability $p_N(x,y)$. We shall only consider the case where the transmission probabilities
$p_N(x,y)$ are spatially homogeneous, nearest-neighbor, and symmetric, and scale with the village size
$N$ in such a way that the expected number of infections by a contagious individual in an otherwise
healthy population is 1 (so that the epidemic is critical), that is,

Assumption 1.

(4) $p_N(x, x + e_i) = C_d/N$ if $|e_i| = 1$ or $0$, where

(5) $C_d = 1/(2d+1)$.
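
The infection rule above can be made concrete with a small simulation. The following is a minimal sketch for $d = 1$, not the construction used in the proofs. The function names, the truncation of the lattice to a finite segment, and the choice that in the SIS case every one of the $N$ villagers is a possible target in each generation (which mirrors the label-choosing rule of the standard coupling in sec. 1.6) are assumptions of this illustration.

```python
import random

def step(infected, pool, n_village, rng):
    """One generation on a finite segment of Z (hypothetical helper).

    infected[x] = number infected at site x now;
    pool[x]     = number of individuals at x who can be infected next.
    """
    sites = len(infected)
    p = 1.0 / (3 * n_village)          # C_1 / N with C_1 = 1/(2*1 + 1)
    new = [0] * sites
    for y in range(sites):
        # infected individuals at y-1, y, y+1 (sites off the segment are empty)
        m = sum(infected[x] for x in (y - 1, y, y + 1) if 0 <= x < sites)
        q = 1.0 - (1.0 - p) ** m       # chance a given target is hit at least once
        new[y] = sum(rng.random() < q for _ in range(pool[y]))
    return new

def simulate(initial, n_village, kind, generations, seed=0):
    """Return the list of infected-count vectors for generations 0..generations."""
    rng = random.Random(seed)
    infected = list(initial)
    never_infected = [n_village - i for i in infected]
    history = [infected]
    for _ in range(generations):
        if kind == "SIR":
            pool = never_infected                 # recovered individuals are immune
        else:
            pool = [n_village] * len(infected)    # assumption: all N are targets in SIS
        infected = step(infected, pool, n_village, rng)
        if kind == "SIR":
            never_infected = [s - i for s, i in zip(never_infected, infected)]
        history.append(infected)
    return history
```

With this rule each infected individual makes $3N$ infection attempts, each succeeding with probability $1/(3N)$, so the expected number of infections in an otherwise healthy population is 1, as required by Assumption 1.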

Similar models incorporating separated clusters have been studied by Schinazi [21], Belhadji
& Lanchier [3], and others, but these studies have focused on SIS variants of the models where
all infected individuals in a colony recover simultaneously, and where infection rates within and
between colonies vary. Critical behavior of certain spatial epidemic models has been addressed in
the literature, in particular for long-range contact processes [20], [12], which are in certain respects
similar to the SIS-d model described above; however, critical behavior of spatial SIR models has
not been previously studied. For surveys of contact models in spatial epidemics, see [12] and [TD].

Interest in spatial epidemic models has largely been focused on dimensions $d \ge 2$, and especially
$d = 2$, for natural reasons. Nevertheless, nearest-neighbor infection models in dimension $d = 1$
may be of interest in certain contexts: many plant and animal species live in river valleys or along
shorelines, and for these the natural dimension for spatial interactions is $d = 1$.

1.3. Epidemic Models and Random Graphs. The models described above have equivalent
descriptions as structured random graphs, that is, percolation processes. Consider first the simple SIR
(Reed-Frost) epidemic. In this model, no individual may be infected more than once; furthermore,
for any pair $x, y$ of individuals, there will be at most one opportunity for infection to pass from $x$ to
$y$ or from $y$ to $x$ during the course of the epidemic. Thus, one could simulate the epidemic by first
tossing a p-coin for every pair $x, y$, drawing an edge between $x$ and $y$ for each coin toss resulting
in a Head, and then using the resulting (Erdős-Rényi) random graph determined by these edges to
determine the course of infection in the epidemic. In detail: if $Y_0$ is the set of infected individuals at
time 0, then the set $Y_1$ of individuals infected at time 1 consists of all $x \notin Y_0$ that are connected to
individuals in $Y_0$, and for any subsequent time $n$, the set $Y_{n+1}$ of individuals infected at time $n + 1$
consists of all $x \notin \cup_{j \le n} Y_j$ who are connected to individuals in $Y_n$. Note that the set of individuals
ultimately infected during the course of the epidemic is the union of those connected components of
the random graph containing at least one vertex in $Y_0$.
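
The random graph construction of the Reed-Frost epidemic translates directly into code. The sketch below is an illustration only; the function names and the adjacency-dictionary representation are choices made here, not taken from the paper. It first tosses the coins to build the graph and then reads off the generations $Y_0, Y_1, \dots$ layer by layer.

```python
import random
from itertools import combinations

def erdos_renyi(n, p, rng):
    """Toss an independent p-coin for every pair of vertices 0..n-1."""
    adj = {v: set() for v in range(n)}
    for u, v in combinations(range(n), 2):
        if rng.random() < p:
            adj[u].add(v)
            adj[v].add(u)
    return adj

def reed_frost_generations(adj, initial):
    """Return [Y_0, Y_1, ...]: Y_{n+1} is the set of vertices outside
    Y_0, ..., Y_n that are joined by an edge to some vertex of Y_n."""
    generations = [set(initial)]
    ever_infected = set(initial)
    while generations[-1]:
        nxt = {v for u in generations[-1] for v in adj[u]} - ever_infected
        ever_infected |= nxt
        generations.append(nxt)
    return generations[:-1]        # drop the trailing empty generation

# Example: a critical graph (p = 1/n) with one initial infective.
graph = erdos_renyi(200, 1 / 200, random.Random(0))
size = sum(len(g) for g in reed_frost_generations(graph, {0}))
```

The generations are disjoint, and their union is the set of vertices in the connected components that meet the initial set, which is the statement made at the end of the paragraph above.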

Similar random graph descriptions may be given for the mean-field SIS and the spatial SIS and
SIR epidemic models. Consider for definiteness the SIR-d epidemic. To simulate this, first build
a random graph by Bernoulli bond percolation on the graph $K_N \times \mathbb{Z}^d$, where $K_N$ is the complete
graph on $N$ vertices. Given the random graph, simulate the generations $Y_n$ of the SIR-d epidemic
by the same rule as in the mean-field case: for each generation $n$, define the set $Y_{n+1}$ of individuals
infected at time $n + 1$ to be the set of all vertices $x \notin \cup_{j \le n} Y_j$ who are connected to individuals in $Y_n$.
Similar random graph descriptions may be given for SIS epidemics, but using oriented percolation
for the random graphs.

1.4. Branching envelopes of spatial epidemics. The branching envelope of a spatial SIS-d
or SIR-d epidemic is a branching random walk on the integer lattice $\mathbb{Z}^d$. This evolves as follows:
Any particle located at site $x$ at time $t$ lives for one unit of time and then reproduces, placing
random numbers $\xi_y$ of offspring at the sites $y$ such that $|y - x| \le 1$. The random variables $\xi_y$ are
mutually independent, each with Binomial-$(N, C_d/N)$ distributions, where $C_d = 1/(2d+1)$. Denote
this reproduction rule by $\mathcal{R}_N$, and denote by $\mathcal{R}_\infty$ the corresponding offspring law in which the
Binomial distribution is replaced by the Poisson distribution with mean $C_d$. Note that for each of
the offspring distributions $\mathcal{R}_N$, the branching random walk is critical, that is, the expected total
number of offspring of a particle is 1.
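
A short simulation may help fix the reproduction rule. The sketch below treats $d = 1$, so each particle at $x$ independently places a Binomial-$(N, 1/(3N))$ number of offspring at each of $x-1$, $x$, $x+1$. The function names and the dictionary representation of a configuration are assumptions of this illustration, and the binomial sampler is a deliberately naive sum of Bernoulli trials.

```python
import random
from collections import Counter

def binomial(n, p, rng):
    """Naive Binomial(n, p) sampler: count successes in n Bernoulli trials."""
    return sum(rng.random() < p for _ in range(n))

def brw_step(config, n_village, rng):
    """One generation of the branching random walk in d = 1.

    config maps site -> number of particles at that site."""
    p = 1.0 / (3 * n_village)          # C_1 / N
    nxt = Counter()
    for x, count in config.items():
        for _ in range(count):          # each particle reproduces independently
            for y in (x - 1, x, x + 1):
                k = binomial(n_village, p, rng)
                if k:
                    nxt[y] += k
    return nxt

def brw(initial, n_village, generations, seed=0):
    """Return the configurations for generations 0..generations."""
    rng = random.Random(seed)
    config = Counter(initial)
    history = [config]
    for _ in range(generations):
        config = brw_step(config, n_village, rng)
        history.append(config)
    return history

# Example: 20 particles at the origin, village size N = 10.
history = brw({0: 20}, 10, 15, seed=3)
```

Each particle has $3 \cdot N \cdot 1/(3N) = 1$ expected offspring in total, which is the criticality noted above; in particular the total mass process is a critical Galton-Watson process.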

A fundamental theorem of S. Watanabe [24] asserts that, under suitable rescaling (the Feller 
scaling) the measure-valued processes naturally associated with critical branching random walks 
converge to a limit, the standard Dawson-Watanabe process, also known as super Brownian motion. 

Definition 1. The Feller-Watanabe scaling operator $\mathcal{F}_k$ scales mass by $1/k$ and space by $1/\sqrt{k}$,
that is, for any finite Borel measure $\mu$ on $\mathbb{R}^d$ and any test function $\phi(x)$,

(6) $\langle \phi, \mathcal{F}_k \mu \rangle = k^{-1} \int \phi(x/\sqrt{k})\,\mu(dx)$.

Watanabe's Theorem. Fix $N$, and for each $k = 1, 2, \dots$ let $Y^k_t$ be a branching random walk with
offspring distribution $\mathcal{R}_N$ and initial particle configuration $Y^k_0$. (In particular, $Y^k_t(x)$ denotes the
number of particles at site $x \in \mathbb{Z}^d$ in generation $[t]$, and $Y^k_t$ is the corresponding Borel measure on
$\mathbb{R}^d$.) If the initial mass distributions converge, after rescaling, as $k \to \infty$, that is, if

(7) $\mathcal{F}_k Y^k_0 \Longrightarrow X_0$

for some finite Borel measure $X_0$ on $\mathbb{R}^d$, then the rescaled measure-valued processes $(\mathcal{F}_k Y^k)_{kt}$ converge
in law as $k \to \infty$:

(8) $(\mathcal{F}_k Y^k)_{kt} \Longrightarrow X_t$.

The limit is the standard Dawson-Watanabe process $X_t$ (also known as super-Brownian motion).
See [13] for more. In dimension $d = 1$ the random measure $X_t$ is for each $t$ absolutely continuous
relative to Lebesgue measure [17], and the Radon-Nikodym derivative $X(t,x)$ is jointly continuous
in $t, x$ (for $t > 0$). In dimensions $d \ge 2$ the measure $X_t$ is almost surely singular, and is supported
by a Borel set of Hausdorff dimension 2 [7].
1.5. Scaling limits of critical spatial epidemics. Our main result, Theorem 1 below, asserts
that after appropriate rescaling the SIS-1 and SIR-1 spatial epidemics converge weakly as $N \to \infty$.
The limit processes are either standard Dawson-Watanabe processes or Dawson-Watanabe processes
with variable-rate killing, depending on the initial configuration of infected individuals. The
Dawson-Watanabe process $X_t$ with killing rate $\theta = \theta(x,t,\omega)$ (assumed to be progressively measurable and
jointly continuous in $t, x$) and variance parameter $\sigma^2$ can be characterized by a martingale problem
(0, sec. 6.2): for every $\phi \in C^2_c(\mathbb{R}^d)$,

(9) $\langle X_t, \phi \rangle - \langle X_0, \phi \rangle - \frac{\sigma^2}{2} \int_0^t \langle X_s, \Delta\phi \rangle\,ds + \int_0^t \langle X_s, \theta(\cdot,s)\phi \rangle\,ds$

is a martingale with the same quadratic variation as for the standard Dawson-Watanabe process.
Existence and distributional uniqueness of such processes in general is asserted in [8] and proved,
in various cases, in [6] and [14]. It is also proved in these articles that the law of a
Dawson-Watanabe process with killing on a finite time interval is absolutely continuous with respect to that
of a standard Dawson-Watanabe process with the same variance parameter, and that the likelihood
ratio (Radon-Nikodym derivative) is [8]

(10) $\exp\left\{ -\int \theta(t,x)\,dM(t,x) - \frac{1}{2} \int \langle X_t, \theta(t,\cdot)^2 \rangle\,dt \right\}$

where $dM(t,x)$ is the orthogonal martingale measure attached to the standard Dawson-Watanabe
process (see [23] and sec. 2.5 below). Absolute continuity implies that sample path properties
are inherited: in particular, if $X_t$ is a one-dimensional Dawson-Watanabe process with killing,
then almost surely the random measure $X_t$ is absolutely continuous, with density $X(x,t)$ jointly
continuous in $x$ and $t$.

Theorem 1. Let $Y^N_t(x)$ be the number infected at time $t$ and site $x$ in a critical SIS-1 or SIR-1
epidemic with village size $N$ and initial configuration $Y^N_0(x)$. Fix $\alpha > 0$, and let $X^N(t,x)$ be the
renormalized particle density function process obtained by linear interpolation in $x$ from the values

(11) $X^N(t,x) := \frac{Y^N_{[N^{\alpha} t]}(\sqrt{N^{\alpha}}\,x)}{\sqrt{N^{\alpha}}}$ for $x \in \mathbb{Z}/\sqrt{N^{\alpha}}$.

Assume that there is a compact interval $J$ such that the initial particle density functions $X^N(0,x)$
all have support contained in $J$, and assume that the functions $X^N(0,x)$ converge in $C_b(\mathbb{R})$ to a
function $X(0,x)$. Then under Assumption 1, as $N \to \infty$,

(12) $X^N(t,x) \Longrightarrow X(t,x)$

where $X(t,x)$ is the density of a Dawson-Watanabe process $X_t$ with initial density $X(0,x)$ and killing
rate $\theta$ depending on the value of $\alpha$ and the type of epidemic (SIS or SIR) as follows:

(a) SIS: If $\alpha < 2/3$ then $\theta(x,t) = 0$.

(b) SIS: If $\alpha = 2/3$ then

(13) $\theta(x,t) = X(x,t)/2$.

(c) SIR: If $\alpha < 2/5$ then $\theta(x,t) = 0$.

(d) SIR: If $\alpha = 2/5$ then

(14) $\theta(x,t) = \int_0^t X(x,s)\,ds$.

The convergence in (12) is weak convergence relative to the Skorohod topology on the space
$D([0,\infty), C_b(\mathbb{R}))$ of cadlag functions $t \mapsto X(t,x)$ valued in $C_b(\mathbb{R})$.
The proof of Theorem 1 is given in section 2; it will depend on Theorem 2 below. In both
Theorems 1 and 2 the assumption that the initial particle densities have common compact support $J$
can undoubtedly be weakened; however, this assumption eliminates certain technical complications
in the arguments (see equation (27) in sec. 2).

Remarks. (A) The case of principal interest is the SIR-1 epidemic. The SIS-1 epidemic is closely
related to the long-range contact process studied by Mueller and Tribe [20] in $d = 1$ and by Durrett
and Perkins [1^ in $d \ge 2$, and in particular the limit process for the SIS-1 epidemic, the
Dawson-Watanabe process with killing rate (13), is the same as that for the rescaled contact process in $d = 1$.
Nevertheless, the rescaled contact process and the discrete-time SIS-1 epidemic studied here differ
in some important technical respects, so parts (a)-(b) of Theorem 1 do not follow from the results
of [20].

(B) The proof strategy here (sec. 2 below) is considerably different from, and simpler than, the
usual approach, based on martingale methods, taken in the literature of weak convergence to
superprocesses, such as in (30], [H], [5], and [TT]. Although martingale methods might be made to
work here, the prospect of using them in connection with processes such as the SIR-1 epidemic with
history-dependent rates is rather daunting. Instead, we rely on the fact that the laws of both the
SIS and SIR epidemics are absolutely continuous with respect to those of their branching envelopes,
for which scaling limits are already known. The Radon-Nikodym derivatives have tractable forms,
as exponentials of certain stochastic integrals. These will be shown to converge to the corresponding
Radon-Nikodym derivatives (10). The advantage of this strategy is that, given Theorem 2, there is
no need to check tightness for the rescaled epidemic processes. Moreover, there is virtually no
additional work involved in establishing the result for the SIS model: all that is needed is an additional
simple asymptotic estimation of the Radon-Nikodym derivatives.

(C) In higher dimensions, there is no analogous threshold effect at criticality. In dimensions
$d \ge 3$, a branching random walk started by any number of particles will quickly diffuse, so that after
a short time most occupied sites will have only $O(1)$ particles. Consequently, in both the SIS-d and
the SIR-d epidemics, the effect of finite population size on the production of new infections will
be limited, and so both epidemics will behave in more or less the same manner as their branching
envelopes. In dimension $d = 2$, the situation is somewhat more interesting: it appears that here
finite population size will manifest itself by a logarithmic drag on the production of new particles.
The cases $d \ge 2$ will be discussed in detail in a forthcoming paper of Xinghua Zheng [2^ .

1.6. Heuristics: The standard coupling. The critical thresholds for the SIS-d and SIR-d
epidemics can be guessed by a simple comparison argument based on the standard coupling of the
epidemic and its branching envelope. For the SIS-d epidemic, the coupling is constructed as follows:
Build a branching random walk whose initial state coincides with that of the epidemic, with particles
to be colored red or blue according to whether or not they represent infections that actually take place
(red particles represent actual infections). Initially, all particles are red. At each time $t = 0, 1, 2, \dots$,
particles of the branching random walk produce offspring at neighboring sites according to the law
described in sec. 1.4 above. Offspring of blue particles are always blue, but offspring of red particles
may be either red or blue, with the choices made as follows: All offspring of red particles at a location
$y$ choose labels $j \in [N]$ at random, independently of all other particles; for any label $j$ chosen by
$k \ge 1$ particles, one of the particles is chosen at random and colored red, and the remaining $k - 1$
particles are colored blue. The population of all particles evolves as a branching random walk, by
construction, while the subpopulation of red particles evolves as an SIS-d epidemic. Observe that
the branching random walk dominates the epidemic: thus, the duration, size, and spatial extent of
the epidemic are limited by those of the branching envelope.
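
The coloring rule for the SIS case is easy to express in code, and doing so shows where the attrition comes from. The sketch below handles a single site and a single generation; the function names are hypothetical, and the closed-form expectation is an elementary occupancy calculation added here for illustration rather than a formula from the paper.

```python
import math
import random

def color_offspring_sis(k, n_village, rng):
    """k offspring of red particles land at one site and choose labels in [N]
    uniformly at random. One particle per chosen label is red, the rest blue.
    Returns (number red, number blue)."""
    labels = [rng.randrange(n_village) for _ in range(k)]
    red = len(set(labels))
    return red, k - red

def expected_blue(k, n_village):
    """E[blue] = k - N * (1 - (1 - 1/N)^k), computed stably."""
    prob_label_used = -math.expm1(k * math.log1p(-1.0 / n_village))
    return k - n_village * prob_label_used

# Example: 50 offspring at a village of size 1000.
red, blue = color_offspring_sis(50, 1000, random.Random(0))
```

For $k$ much smaller than $N$ the expected number of blue particles is close to $k(k-1)/(2N)$, so the attrition at a site is roughly quadratic in the number of particles there. This is the quantity estimated heuristically in the paragraphs that follow.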
The standard coupling of an SIR-d epidemic with its branching envelope is constructed in a
similar fashion, but with the following rule governing choices of color by offspring of red particles:
All offspring of red particles at a location $y$ choose numbers $j \in [N]$ at random, independently of
all other particles. If a particle chooses a number $j$ that was previously chosen by a particle of an
earlier generation at the same site $y$, then it is assigned color blue. If $k \ge 1$ offspring of red particles
choose the same number $j$ at the same time, and if $j$ was not chosen in an earlier generation, then 1
of the particles is assigned color red, while the remaining $k - 1$ are assigned color blue. Under this
rule, the subpopulation of red particles evolves as an SIR-d epidemic.

In both couplings, the production of blue offspring by red particles may be viewed as an attrition
of the red population. Assume that initially there are $N^{\alpha}$ particles; then by Feller's limit theorem
for critical Galton-Watson processes, the branching envelope can be expected to survive for $O_P(N^{\alpha})$
generations, and at any time prior to extinction the population will have $O_P(N^{\alpha})$ members. These
will be distributed among the sites at distance $O_P(N^{\alpha/2})$ from the origin, and therefore in dimension
$d = 1$ there should be about $O_P(N^{\alpha/2})$ particles per site. Consequently, for the SIS-1 epidemic,
the rate of attrition per site per generation should be $O_P(N^{\alpha - 1})$, and so the total attrition rate per
generation should be $O_P(N^{3\alpha/2 - 1})$. If $\alpha = 2/3$, then the total attrition rate per generation will be
$O_P(1)$, just enough so that the total attrition through the duration of the branching random walk
envelope will be on the same order of magnitude as the population size $N^{\alpha}$.

For the SIR-1 epidemic there is a similar heuristic calculation. As for the SIS-1 epidemic,
the branching envelope will survive for $O_P(N^{\alpha})$ generations, and up to the time of extinction the
population should have $O_P(N^{\alpha})$ individuals, about $O_P(N^{\alpha/2})$ per site. Therefore, through $N^{\alpha}$
generations, about $N^{\alpha} \times N^{\alpha/2}$ numbers $j$ will be retired, and so the attrition rate per site per
generation should be $O_P(N^{\alpha/2} \times N^{3\alpha/2 - 1})$, making the total attrition rate per generation $O_P(N^{5\alpha/2 - 1})$.
Hence, if $\alpha = 2/5$ then the total attrition per generation should be $O_P(1)$, just enough so that the
total attrition through the duration of the branching random walk envelope will be on the same
order of magnitude as the population size.
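
The exponent bookkeeping in the two heuristic calculations can be checked mechanically. The sketch below tracks only powers of $N$, using exact rational arithmetic; the function names are hypothetical and the code encodes nothing beyond the counting argument just given.

```python
from fractions import Fraction

def total_attrition_exponent(alpha, kind):
    """Exponent e such that the total attrition per generation is N^e,
    following the heuristic count for d = 1. alpha must be a Fraction."""
    per_site = alpha / 2            # particles per site: N^{alpha/2}
    sites = alpha / 2               # number of occupied sites: N^{alpha/2}
    if kind == "SIS":
        # collisions among N^{alpha/2} particles choosing from N labels
        per_site_rate = 2 * per_site - 1
    else:
        # SIR: N^alpha generations x N^{alpha/2} labels retired per generation
        retired = alpha + per_site
        per_site_rate = per_site + retired - 1
    return per_site_rate + sites

def critical_alpha(kind):
    """Solve e(alpha) = 0, using that e(alpha) = slope * alpha - 1."""
    slope = total_attrition_exponent(Fraction(1), kind) + 1
    return 1 / slope

sis_threshold = critical_alpha("SIS")   # Fraction(2, 3)
sir_threshold = critical_alpha("SIR")   # Fraction(2, 5)
```

The two roots are the thresholds $\alpha = 2/3$ and $\alpha = 2/5$ that appear in Theorem 1.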

1.7. Weak convergence in $D([0,\infty), C_b(\mathbb{R}))$. The heuristic argument above has an obvious gap:
it relies crucially on the assertion that the particles of a critical branching random walk distribute
themselves somewhat uniformly, at least locally, among the sites at distance $N^{\alpha/2}$ from the origin.
The fact that the Dawson-Watanabe process in dimension one has a continuous density suggests that
this should be true, but does not imply it. Following is a strengthening of the Watanabe theorem
suitable for our purposes.

Denote by $C_b(\mathbb{R})$ the space of continuous, bounded, real-valued functions on $\mathbb{R}$ with the sup-norm
topology, and by $D([0,\infty), C_b(\mathbb{R}))$ the Skorohod space of cadlag functions $X(t,x)$ valued in $C_b(\mathbb{R})$
(thus, for each $t \ge 0$ the function $X(t,x)$ is a continuous, bounded function of $x$). Fix $N \le \infty$, and
for $k = 1, 2, \dots$ let $Y^k_t(x)$ be the number of particles at site $x$ at time $[t]$ in a branching random
walk with offspring distribution $\mathcal{R}_N$ and initial particle configuration $Y^k_0(\cdot)$. Let $X^k(t,x)$ be the
renormalized density function: that is, the function obtained by linear interpolation (in $x$) from the
values

(15) $X^k(t,x) = \frac{Y^k_{kt}(\sqrt{k}\,x)}{\sqrt{k}}$ for $x \in \mathbb{Z}/\sqrt{k}$.

Theorem 2. Assume that there is a compact interval $J \subset \mathbb{R}$ such that all of the initial particle
densities $X^k(0,\cdot)$ have support $\subset J$, and assume that as $k \to \infty$ the functions $X^k(0,\cdot)$ converge in
$C_b(\mathbb{R})$ to a continuous function $X(0,\cdot)$. Then as $k \to \infty$,

(16) $X^k(t,x) \Longrightarrow X(t,x)$,

where $X(t,x)$ is the density function of a Dawson-Watanabe process with initial density $X(0,x)$,
and $\Longrightarrow$ indicates weak convergence relative to the Skorohod topology on $D([0,\infty), C_b(\mathbb{R}))$.

To prove this, it suffices to show that the sequence $X^k(t,x)$ of densities is tight in $D([0,\infty), C_b(\mathbb{R}))$,
because Watanabe's theorem implies that any weak limit of a subsequence must be a density of the
Dawson-Watanabe process. The proof of tightness, carried out in section 3 below, will be based on
a form of the Kolmogorov-Chentsov tightness criterion and a moment calculation. This proof will
use only three properties of the offspring law $\mathcal{R}_N$:

(a) the mean number of offspring is 1;

(b) the total number of offspring has finite $m$th moment, for each $m \ge 1$; and

(c) offspring choose locations at random from among the neighboring sites.

Since the $m$th moments are bounded uniformly over the class of Binomial-$(N, 1/N)$ distributions,
it will follow that tightness holds simultaneously for all of the offspring laws $\mathcal{R}_N$. Only the case
$N = \infty$ will be needed for the analysis of the spatial epidemic models, however.

Remark. Mueller and Tribe [20] proved that the density processes of rescaled long-range contact 
processes in one dimension converge weakly to the density process of a Dawson-Watanabe process 
with killing, but make no explicit use of branching random walks in their argument. Nevertheless, 
they likely were aware that the density processes of rescaled branching random walks would also 
converge weakly in one dimension. 

1.8. Spatial extent of the Dawson-Watanabe process. An object of natural interest in
connection with the spatial SIS and SIR epidemics is the spatial extent of the process, that is, the area
reached by the infection. Under certain natural restrictions on the initial configuration of infected
individuals, the spatial extent will, by Theorem 1, be well-approximated in law by the area covered
by the limiting Dawson-Watanabe process, after suitable scaling. For Dawson-Watanabe processes
$X_t$ with location-dependent killing, the distribution of $\mathcal{R}(X) := \cup_t \operatorname{support}(X_t)$ likely cannot be
described in closed form. However, for the Dawson-Watanabe process with constant killing rate, the
distribution of $\mathcal{R}(X)$ can be given in a computable form, as we now show.

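Proposition 1. Let $X_t$ be the standard one-dimensional Dawson-Watanabe process with variance
parameter $\sigma^2 = 1$. For any finite Borel measure $\mu$ with support contained in the interval $D = (0, a)$,

(17) $-\log P\{\mathcal{R}(X) \subset D \mid X_0 = \mu\} = \int \wp_L(x/\sqrt{6})\,\mu(dx)$

where

(18) $\wp_L(x) = \frac{1}{x^2} + \sum_{\omega \in L \setminus \{0\}} \left\{ \frac{1}{(x - \omega)^2} - \frac{1}{\omega^2} \right\}$

is the Weierstrass $\wp$-function with period lattice $L$ generated by $\sqrt{6}\,a\,e^{\pi i/3}$.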
Proof. By a theorem of Dynkin ([13], ch. 8), the function

(19) $u_D(x) := -\log P\{\mathcal{R}(X) \subset D \mid X_0 = \delta_x\}$

is the unique solution of the differential equation $u'' = u^2$ in $D$ with boundary conditions $u(x) \to \infty$
as $x \to 0, a$. Set $v = u'$; then the equation $u'' = u^2$ becomes $v' = u^2$, and so $dv/du = u^2/v$.
Integration gives

$v^2/2 = u^3/3 + C$,

which, up to constants, is the differential equation of the $\wp$-function [5S], ch. 20. The result (17)
now follows for the special case $\mu = \delta_x$. The general case now follows by the superposition principle
for branching processes: in particular, the Dawson-Watanabe process $X_t$ with initial condition
$X_0 = \mu + \nu$ can be decomposed as the union of independent Dawson-Watanabe processes with
initial conditions $\mu$, $\nu$, respectively, and so the quantity on the left side of (17) is linear in $\mu$. It
follows that (17) holds for all initial measures $\mu$ with support contained in $D$. □



2. Spatial Epidemic Models: Proof of Theorem 1

2.1. Strategy. We shall exploit the fact that the laws of the spatial SIS and SIR epidemics are
absolutely continuous with respect to the laws of their branching envelopes. Because the branching
envelopes converge weakly, after rescaling, to super-Brownian motions, by Watanabe's theorem (and
in the stronger sense of Theorem 2), to prove the weak convergence of the rescaled spatial epidemics
it will suffice to show that the likelihood ratios converge, in a suitable sense, to the likelihood ratios
of the appropriate Dawson-Watanabe processes relative to super-Brownian motion:

Proposition 2. Let $X_n$, $X$ be random variables valued in a metric space $\mathcal{X}$, all defined on a common
probability space $(\Omega, \mathcal{F}, P)$, and let $L_n$, $L$ be nonnegative, real-valued random variables on $(\Omega, \mathcal{F}, P)$
such that

(20) $E_P L_n = E_P L = 1 \quad \forall\, n$.

Let $Q_n$, $Q$ be the probability measures on $(\Omega, \mathcal{F})$ with likelihood ratios $L_n$, $L$ relative to $P$. If

(21) $(X_n, L_n) \Longrightarrow (X, L)$

under $P$ as $n \to \infty$, then the $Q_n$-distribution of $X_n$ converges to the $Q$-distribution of $X$, that is,
for every bounded continuous function $f : \mathcal{X} \to \mathbb{R}$,

(22) $\lim_{n \to \infty} E_{Q_n} f(X_n) = E_Q f(X)$.

Proof. Routine. □
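
Since the proof is left to the reader, one possible way to fill it in is sketched here. This is an editorial reconstruction under the hypotheses (20) and (21), based on a standard uniform-integrability argument, and is not necessarily the argument the author had in mind.

```latex
% Step 1: rewrite both sides as P-expectations.
\[
  E_{Q_n} f(X_n) = E_P\bigl[f(X_n)\,L_n\bigr],
  \qquad
  E_{Q} f(X) = E_P\bigl[f(X)\,L\bigr].
\]
% Step 2: the map (x,\ell) \mapsto f(x)\,\ell is continuous, so by (21) and the
% continuous mapping theorem,
\[
  f(X_n)\,L_n \Longrightarrow f(X)\,L \quad \text{under } P .
\]
% Step 3: L_n \ge 0, L_n \Longrightarrow L, and E_P L_n = E_P L = 1 < \infty by (20);
% for nonnegative variables, convergence in law together with convergence of the
% means to the (finite) mean of the limit implies that \{L_n\} is uniformly integrable.
% Step 4: since |f(X_n) L_n| \le \|f\|_\infty L_n, the family \{f(X_n) L_n\} is also
% uniformly integrable, and therefore
\[
  \lim_{n\to\infty} E_P\bigl[f(X_n)\,L_n\bigr] = E_P\bigl[f(X)\,L\bigr],
\]
% which is (22).
```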

Recall [6] that the law of the Dawson-Watanabe process with killing is absolutely continuous with
respect to that of the standard Dawson-Watanabe process, with likelihood ratio given by (10). The
likelihood ratio involves stochastic integration with respect to an orthogonal martingale measure;
thus, the obvious strategy for proving (21) in our context is to express the likelihood ratios of
the spatial epidemic processes in terms of stochastic integrals against the OMMs of the branching
envelopes. Although it is possible to work directly with the likelihood ratios of the epidemic processes
to their branching envelopes, this is somewhat messy, for two reasons: (A) the offspring distributions
$\mathcal{R}_N$ of the branching envelopes change with the village size $N$; and (B) the random mechanism by
which particles of the branching envelope are culled in the standard coupling involves dependent
Bernoulli random variables. Therefore, we will first show, by comparison arguments, that the spatial
epidemic processes can be modified so that difficulties (A) and (B) are circumvented, and in such a
way that asymptotic behavior is not affected. The likelihood ratios of the modified processes relative
to critical Poisson branching random walks will then be computed in sec. [



2.2. Extent, duration, size, and density of the branching envelope. Since the spatial SIS
and SIR epidemics are stochastically dominated by their branching envelopes, their durations, sizes,
etc., are limited by those of their envelopes. For critical branching random walks, the scaling limit
theorems of Watanabe and Feller give precise information about these quantities. Consider first the
duration: Since the total mass in a BRW is just a Galton-Watson process, if the branching random
walk is initiated by $N^{\alpha}$ particles, then by a standard result in the theory of Galton-Watson processes
(Th. 1.9.1 of [2]), the time $T_N$ to extinction scales like $N^{\alpha}$, that is,

(23) $T_N/N^{\alpha} \Longrightarrow F$.

The limit distribution $F$ is the distribution of the first passage time to 0 by a Feller diffusion process
started at 1. Consequently, under the hypotheses of Theorem 1, the duration of the BRW is $O_P(N^{\alpha})$.
Furthermore, if $Z^N_n$ is the mass in the $n$th generation (that is, the total number of particles), then
by Feller's theorem ([13], ch. 1),

(24) $Z^N_{[N^{\alpha} t]}/N^{\alpha} \Longrightarrow Z_t$,

where $Z_t$ is a Feller diffusion process started at 1. Consequently, the total mass produced during
the entire course of the branching envelope is of order $O_P(N^{2\alpha})$: in particular,

(25) $\sum_n Z^N_n / N^{2\alpha} \Longrightarrow \int_0^{\infty} Z_t\,dt$.

Since the Feller diffusion is absorbed at 0 in finite time almost surely, the integral is finite with
probability 1.

Next, consider the maximal density and spatial extent of the branching random walk. Watanabe's
theorem implies that if initially all particles are located in an interval of size $N^{\alpha/2}$ centered at 0,
as required by Theorem 1, then the bulk of the mass must remain within $O_P(N^{\alpha/2})$ of the origin,
because the limiting Dawson-Watanabe process has bounded support. A theorem of Kesten
implies that in fact all of the mass remains within $O_P(N^{\alpha/2})$ of the origin: that is, under the
hypotheses of Theorem 1, if

(26) $R_N := \max\left\{ |x| : \sum_{t < \infty} Y^N_t(x) > 0 \right\}$,

then for any $\varepsilon > 0$ there exists $\beta < \infty$ such that

(27) $P\{R_N > \beta N^{\alpha/2}\} < \varepsilon$.

Together, (23) and (26) imply that if initially the branching random walk has $O(N^{\alpha})$ particles all
located at sites within distance $O(N^{\alpha/2})$ of the origin, then the number of site/time pairs $(x,t)$
reached by the branching random walk is $O_P(N^{3\alpha/2})$. Now suppose in addition that the initial
configurations satisfy the more stringent requirement $X^N(0,x) \to X(0,x)$ of Theorem 1; then
Theorem 2 implies that the renormalized density processes $X^N(t,x) \Longrightarrow X(t,x)$, where $X(t,x)$ is the
renormalized density process of the standard Dawson-Watanabe process. Since $X(t,x)$ is jointly
continuous and has compact support [17], it follows that

(28) $\max_{t,x} Y^N_t(x) = O_P(N^{\alpha/2})$.



2.3. Binomial/Poisson and Poisson/Poisson comparisons. In this section we show that, in
the asymptotic regimes considered in Theorem 1, the Binomial-$(N, 1/N)$ random variables used in
the standard coupling (sec. 1.6) can be replaced by Poisson-1 random variables without changing
the asymptotic behavior of the density processes $X^N(t,x)$. Recall that in the standard coupling, each
particle, whether red or blue, produces a random number of offspring with the Binomial-$(N, 1/N)$
distribution. The total number of particles produced during the lifetime of the branching envelope
is, under the hypotheses of Theorem 1, at most $O_P(N^{2\alpha})$ by (24), and $\alpha \le 2/3$ in all scenarios
considered. Consequently, if all of the Binomial-$(N, 1/N)$ random variables used in the construction were
replaced by Poisson-1 random variables, the resulting processes (both the red process, representing
the spatial epidemic, and the red+blue process, representing the branching envelope) would be
indistinguishable from the original processes, by the following lemma.

Lemma 1. Assume that under the probability measure $\mu_N$, the random variables $X_1, X_2, \dots$ are
i.i.d. Binomial-$(N, 1/N)$, and that under measure $\nu$ they are i.i.d. Poisson-1. Let $m = m_N$ be a
sequence of positive integers such that for some $\varepsilon > 0$,

(29) $m_N = O(N^{2-\varepsilon})$.

If $\mathcal{G}_m$ is the $\sigma$-algebra generated by $X_1, X_2, \dots, X_m$, then

Proof. This is a routine calculation. Fix a sequence $x_1, x_2, \dots, x_m$ of nonnegative integers; the
likelihood ratio $d\mu_N/d\nu$ of this sequence is

For $m = O(N^{2-\varepsilon})$,

$\{(1 - N^{-1})^N e\}^m \sim \exp\{-m/2N\}$.
By Chebyshev's inequality and elementary calculus, for $\nu$-typical sequences $x_1, x_2, \dots, x_m$,



□ 
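The deterministic factor in the proof of Lemma 1 can be checked numerically. The following is a minimal sketch, not part of the original argument: the function names are illustrative, and the choice $m = N^{3/2}$ is just one example of a sequence with $m = O(N^{2-\varepsilon})$. It compares the logarithm of $\{(1 - N^{-1})^N e\}^m$ with its leading-order value $-m/2N$.

```python
import math


def log_binomial_poisson_factor(N: int, m: int) -> float:
    """Return log of {(1 - 1/N)^N * e}^m, using log1p for numerical stability."""
    return m * (N * math.log1p(-1.0 / N) + 1.0)


def leading_order(N: int, m: int) -> float:
    """Leading-order approximation -m/(2N) to the same logarithm."""
    return -m / (2.0 * N)


if __name__ == "__main__":
    for N in (10**3, 10**4, 10**5):
        m = int(N ** 1.5)  # illustrative choice with m = O(N^{2 - eps}), eps = 1/2
        exact = log_binomial_poisson_factor(N, m)
        approx = leading_order(N, m)
        print(N, m, exact, approx, exact - approx)
```

The difference between the two logarithms is of order $m/N^2$, which tends to zero exactly when $m = o(N^2)$; this is where the hypothesis (29) enters.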



Recall that in the standard coupling (sec. 1.6), red particles represent infections that occur in the
spatial epidemic, whereas blue particles represent attempted infections that are suppressed because
either two or more infected individuals try to infect the same susceptible simultaneously, or (in the
SIR case) because the target of the attempted infection has acquired immunity by dint of an earlier
infection. It is possible that more than one attempted infection is suppressed at once, that is, more
than one blue particle with a red parent is created at a given site/time. In sec. 2.4 below, we will
show that such occurrences are sufficiently rare that their effect on the epidemic process is negligible
in the large-$N$ limit. To do so, we will bound the set of offspring of such blue particles by the set of
discrepancies between a Poisson branching random walk with mean offspring number $1 + \varepsilon$ and one
with mean 1, for some small $\varepsilon$. The next result shows that if $\varepsilon$ is sufficiently small relative to the
size (total number of particles) of the Poisson branching random walk, then the effect of changing
the mean offspring number is negligible.

Lemma 2. Let $\mu_K$ and $\nu_K$ be the distributions of Poisson branching random walks with mean
offspring numbers $1 + \varepsilon_K$ and 1, respectively, and common initial configuration $Y^K_0(x)$ with $K$
particles. If

(31) $\varepsilon_K = o(1/K)$

then under $\nu_K$, as $K \to \infty$,

(32) $\frac{d\mu_K}{d\nu_K} \longrightarrow 1$.

Proof. For a given sample evolution in which $Z_K$ particles are created, the likelihood ratio of $\mu_K$
relative to $\nu_K$ is

$\frac{d\mu_K}{d\nu_K} = (1 + \varepsilon_K)^{Z_K} \exp\{-\varepsilon_K Z_K\} = 1 + O(Z_K \varepsilon_K^2)$.

Under $\nu_K$, the branching random walk will last on the order of $K$ generations, during which on the
order of $K$ particles will be created in each generation, by Watanabe's theorem. Hence, $Z_K$ will
typically be of size $K^2$. In fact, if $Z^K_n$ is the number of particle creations in the $n$th generation, then
$Z^K_n$ is, under $\nu_K$, a Galton-Watson process with offspring distribution Poisson-1 and initial condition
$Z^K_0 = K$, so Feller's theorem implies

$Z_K / K^2 \Longrightarrow \int_0^\infty Z_t \, dt$,

where $Z_t$ is a Feller diffusion process with initial state $Z_0 = 1$. The assertion now follows from the
hypothesis (31).

□

The measures $\mu_K, \nu_K$ can be coupled as follows: Start with an initial configuration $Y^K_0(x)$, as
in the lemma, and let particles reproduce and move as in a branching random walk with offspring
distribution Poisson-$(1 + \varepsilon_K)$. Attach to each particle $\zeta$ a Bernoulli-$\varepsilon_K/(1 + \varepsilon_K)$ random variable $U_\zeta$.
Assign colors green or orange to particles according to the following rules: (A) Offspring of green
particles are always green. (B) An offspring $\zeta$ of an orange particle is green if $U_\zeta = 1$, otherwise
it is orange. Then the process of orange particles evolves as a branching random walk with offspring
distribution Poisson-1, and the process of all particles, green and orange, evolves as a BRW with
offspring distribution Poisson-$(1 + \varepsilon_K)$. Denote by $Y^{K,G}_t(x)$ the number of green particles at site
$x$, time $t$, and by $X^{K,G}(t,x)$ the renormalized density function obtained from $Y^{K,G}_t(x)$ by the
renormalization rule used for the density processes, with $K$ in place of $N^{\alpha}$.
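The green/orange coupling is a Poisson thinning, and one generation of it can be sketched in a few lines. The code below is an illustration only: the names `poisson` and `step` are hypothetical, the sampler is Knuth's multiplication method (adequate for means near 1), and the displacement law (uniform on $-1, 0, +1$) is taken from the nearest-neighbor setting of this paper. By the thinning property, the orange offspring of an orange parent are Poisson-1 and the green offspring are Poisson-$\varepsilon$.

```python
import math
import random
from collections import Counter


def poisson(rng: random.Random, mean: float) -> int:
    """Sample a Poisson variate by Knuth's multiplication method (small means only)."""
    threshold = math.exp(-mean)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def step(rng: random.Random, orange: Counter, green: Counter, eps: float):
    """Advance the coupled process one generation.

    orange and green map site -> particle count. Every particle has
    Poisson(1 + eps) offspring, each displaced by -1, 0 or +1.
    """
    new_orange, new_green = Counter(), Counter()
    mark_prob = eps / (1.0 + eps)  # Bernoulli parameter attached to each offspring
    for parent_is_green, config in ((False, orange), (True, green)):
        for site, count in config.items():
            for _ in range(count):
                for _ in range(poisson(rng, 1.0 + eps)):
                    target = site + rng.choice((-1, 0, 1))
                    # Rule (A): offspring of green particles are green.
                    # Rule (B): offspring of an orange particle is green iff its mark is 1.
                    if parent_is_green or rng.random() < mark_prob:
                        new_green[target] += 1
                    else:
                        new_orange[target] += 1
    return new_orange, new_green


if __name__ == "__main__":
    rng = random.Random(0)
    orange, green = Counter({0: 1000}), Counter()
    for generation in range(5):
        orange, green = step(rng, orange, green, eps=0.01)
        print(generation + 1, sum(orange.values()), sum(green.values()))
```

With $\varepsilon$ small relative to $1/K$, the green population stays a vanishing fraction of the total over the roughly $K$ generations the process survives, which is the content of the corollary that follows.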

Corollary 1. Assume that the initial configurations $Y^K_0(\cdot)$ of the branching random walks satisfy
the hypotheses of Theorem 2. If $\varepsilon_K = o(1/K)$, then as $K \to \infty$,

(33) $\max_{t,x} X^{K,G}(t,x) \Longrightarrow 0$.

Proof. Denote by $X^{K,O}(t,x)$ the renormalized density process associated with the branching random
walk of orange particles. By Theorem 2, the processes $X^{K,O} \Rightarrow X$, where $X = X(t,x)$ is the density of
a standard Dawson-Watanabe process. By Lemma 2, the density $X^{K,O} + X^{K,G}$ of the orange+green
particle branching random walk also converges to the density of a standard Dawson-Watanabe
process. Since there is only one Dawson-Watanabe density process, it must be that $X^{K,G}$ converges
weakly to zero in sup norm. □

2.4. Multiple collisions. In the standard coupling (sec. 1.6) of a spatial SIS or SIR epidemic with
its branching envelope, offspring of red particles at each site choose labels $j \in [N]$ at random, which
are then used to determine colors as follows: (A) If an index $j$ is chosen by more than one particle,
then all but one of these are colored blue. (B) (SIR model only) If index $j$ was chosen at the same
site in an earlier generation, then all particles that choose $j$ are colored blue. We call events (A)
or (B), where offspring of red particles are colored blue, collisions. At a site/time where there are
$\ge 2$ blue offspring of red particles we say that a multiple collision has occurred. In this section,
we show that for either SIS or SIR epidemics, up to the critical thresholds (see the statement
of Theorem 1), the effects of multiple collisions on the evolution of the red particle process are
asymptotically negligible. In particular, this will justify replacing the standard coupling of sec. 1.6
by the following modification, in which at each time/site there is at most one blue offspring of a red
parent.

Modified Standard Coupling: Particles are colored red or blue. Each particle produces a random
number of offspring, according to the Poisson-1 distribution, which then randomly move either
$+1$, $-1$, or $0$ steps from their birth site. Once situated, these offspring are assigned colors according
to the following rules:

(A) Offspring of blue particles are blue; offspring of red particles may be either red or blue.

(B) At any site/time $(x,t)$ there is at most one blue offspring of a red parent.




(C) Given that there are $y$ offspring of red parents at site $x$, time $t$, the conditional probability
$\kappa_N(y) = \kappa_{N,t,x}(y)$ that one of them is blue is

(34) $\kappa_N(y) = y(y-1)/(2N)$ for SIS epidemics, and

(35) $\kappa_N(y) = yR/N$ for SIR epidemics, where

(36) $R = R^N_t(x) = \sum_{s<t} Y^N_s(x)$.

Here $Y^N_t(x)$ is the number of red particles at site $x$ in generation $t$, and so $R = R^N_t(x)$ is the number
of recovered individuals at site $x$ at time $t$, equivalently, the number of labels $j \in [N]$ that have been
used in the standard coupling at $x$ by time $t$. Observe that in both the SIS and the SIR cases, the
value of $\kappa_N(y)$ is almost, but not exactly, equal to the conditional probability that in the standard
coupling there would be at least one blue offspring of a red particle. The small discrepancies will
make the expressions in the likelihood ratios (51) simpler.
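To see how small the discrepancy is in the SIS case, one can compare $\kappa_N(y) = y(y-1)/(2N)$ with the exact probability that $y$ labels drawn uniformly from $[N]$ are not all distinct (a birthday-problem computation). The sketch below is illustrative: the function names are ours, and the bound $y^4/N^2$ used in the printout is a convenient crude bound on the gap, not a sharp constant.

```python
def kappa_sis(y: int, N: int) -> float:
    """kappa_N(y) = y(y-1)/(2N): the expected number of label-sharing pairs."""
    return y * (y - 1) / (2 * N)


def prob_some_collision(y: int, N: int) -> float:
    """Exact probability that y uniform labels from [N] are not all distinct."""
    p_distinct = 1.0
    for i in range(y):
        p_distinct *= 1.0 - i / N
    return 1.0 - p_distinct


if __name__ == "__main__":
    for y, N in ((5, 10**3), (10, 10**3), (30, 10**4)):
        gap = kappa_sis(y, N) - prob_some_collision(y, N)
        print(y, N, kappa_sis(y, N), prob_some_collision(y, N), gap, y**4 / N**2)
```

The expected number of colliding pairs always dominates the probability of at least one collision, and the gap is of second order in $y^2/N$.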

The next result will justify replacing the standard coupling of sec. 1.6 by the modified standard
coupling.

Proposition 3. The standard couplings and the modified standard couplings can be constructed
simultaneously in such a way that the following is true, for initial configurations satisfying the
hypotheses of Theorem 1: If $Y^{\Delta}_t(x)$ is the discrepancy between the numbers of red particles at
$(x,t)$ in the standard and modified standard couplings, then as $N \to \infty$,

(37) $\max_{t,x} Y^{\Delta}_t(x) = o_P(N^{\alpha/2})$.

Lemma 3. Let $B_N$ (B for "bad") be the number of sites/times $(x,t)$ where there are at least 4 blue
offspring of red particles in the standard coupling. Then as $N \to \infty$,

(38) $B_N = o_P(1)$.

Proof. Theorem 2 implies that, under the hypotheses of Theorem 1, the maximum number of particles
(of any color) at any site/time in the standard coupling is $O_P(N^{\alpha/2})$. (Note: This also relies on
the fact that the limiting Dawson-Watanabe density process $X(t,x)$ is continuous and has compact
support, w.p.1.) Hence, we may restrict attention to sample evolutions where $y = y_t(x)$, the total
number of offspring of red particles at $(x,t)$, satisfies $y \le CN^{\alpha/2}$ for some fixed constant $C < \infty$.
Furthermore, by the considerations of sec. 2.2, we may restrict attention to sample evolutions of
duration $\le CN^{\alpha}$ and spatial extent $\le CN^{\alpha/2}$. Since $\alpha \le 2/3$, it follows that the number of pairs $(x,t)$
visited by particles of the branching envelope is no more than $C^2 N$.

In order that there be at least 4 blue offspring of red parents at site $x$, time $t$, at least 4 pairs
(possibly overlapping) of red-parent offspring must choose common labels $j \in [N]$. The conditional
probability of this happening, given the value of $y = y_t(x)$, is no more than $C' y^8/N^4$, for some
constant $C'$ not depending on $y$ or $N$. But $y \le CN^{\alpha/2}$, so this conditional probability is bounded
by $C'' N^{4\alpha - 4} \le C'' N^{-4/3}$. Since there are only $C^2 N$ sites to consider, it follows that, on the event
delineated in the preceding paragraph, the probability that $B_N \ge 1$ is $O(N^{-1/3})$. □
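The exponent bookkeeping in this proof is easy to get wrong, so here is a small exact-arithmetic check. It is a sketch under the assumptions stated in the proof ($y \le CN^{\alpha/2}$, at most of order $N^{3\alpha/2}$ site/time pairs); the function name is hypothetical.

```python
from fractions import Fraction


def lemma3_exponents(alpha: Fraction):
    """Return the exponents of N in the per-site bound, the site count, and their product."""
    per_site = 4 * alpha - 4          # y^8 / N^4 with y of order N^{alpha/2}
    sites = Fraction(3, 2) * alpha    # duration N^alpha times spatial extent N^{alpha/2}
    return per_site, sites, per_site + sites


if __name__ == "__main__":
    for alpha in (Fraction(1, 2), Fraction(2, 3)):
        print(alpha, lemma3_exponents(alpha))
```

At the critical value $\alpha = 2/3$ the three exponents are $-4/3$, $1$ and $-1/3$, matching the bounds in the proof; for smaller $\alpha$ the total exponent is more negative still.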

Proof of Proposition 3. In this construction, each particle will be two-sided: the $S$-side will represent
the color of the particle in the standard coupling, and the $M$-side the color in the modified
standard coupling. A particle will be called a hybrid if the colors of its two sides disagree. The strategy
will be to show that colors can be assigned in such a way that the process of hybrid particles is
dominated by the green particle process of Corollary 1; the result (37) will then follow from (33).

Consider first the SIS case. Observe that in this case $\kappa_N(y)$ is the conditional expectation, in
the standard coupling, of the number of pairs that share labels, given that there are $y$ offspring of
red parents at a site. Thus, $\kappa_N(y)$ exceeds (by a small amount) the conditional probability in the
standard coupling that at least one of the $y$ red-parent offspring would be blue. Denote by $\Delta_N(y)$
the excess; note that

$\Delta_N(y) = O(y^4/N^2)$.

The rules by which the process evolves are as follows: First, all particles reproduce, each creating
a random number of offspring with Poisson-1 distribution. Each offspring then moves $+1$, $-1$, or
$0$ steps from its birth site, and chooses a random label $j \in [N]$. Particles with "genotype" BB
(that is, offspring of particles with coloring BB; the first letter denotes the $S$-color, the second
the $M$-color) will always be colored BB, and their labels $j$ will play no role in determining the
colors of the other offspring. However, the labels of all other offspring matter. Say that there is an
$S$-duplication at label $j$ if at least two particles both with a red $S$-gene choose label $j$; similarly,
say that an $M$-duplication occurs at $j$ if $j$ is chosen by at least two particles with $M$-gene R.
(Note: If both BR and RB genotype particles choose label $j$, only the RB particles are counted in
the possible $S$-duplication, and only BR particles in the $M$-duplication.) Particles at $(x,t)$ are
now assigned color "phenotypes" by the following rules:

(D) (Default) If there is a duplication involving at least two particles of genotype RR, do the
following: Among all such duplications, choose one (say $i$) at random; choose one of the
genotype-RR particles with label $i$, assign it phenotype BB, and give all of the other particles
with label $i$ the same genotypes as their parents. Give all other particles at the site the same
$M$-colors as their parents, and assign the remaining $S$-colors by rule (S):

(S) If there are labels $j \ne i$ with $S$-duplications, then for each such label

(a) If there is at least one particle with genotype RB in the duplication, then choose one
of all such RB particles at random, give it phenotype BB, and give all other particles
involved in the duplication $S$-color R.

(b) Otherwise, if all particles involved in the duplication have genotype RR, choose one at
random and assign it phenotype BR, and give all of the rest phenotype RR.

(c) Give all particles not involved in $S$-duplications the same $S$-colors as their parents.

(M) If there are no duplications of type (D) but at least one $M$-duplication, then in any such
duplication, at least one particle of genotype BR must be involved. Choose one at random
and assign it phenotype BB, give all of the remaining particles at the site $M$-color R, and
assign $S$-colors by rules (S)-(a),(c).

(A) (Adjustment Step) If there are no $M$-duplications, then toss a $\Delta_N(y)$-coin: If it comes up
Heads, choose one of the particles with genotype RR at random, give it $M$-color B, and
give all of the rest $M$-color R. If it comes up Tails, give every particle the same $M$-color
as its parent.

These rules guarantee that the $S$-colors of the particles behave as in the standard coupling,
and that the $M$-colors behave as in the modified standard coupling. Therefore, the discrepancy
$Y^{\Delta}_t(x)$ is bounded by the number of hybrid particles at $(x,t)$. Hybrids can be offspring of RR,
BR, or RB particles, but not BB particles; however, a hybrid can only be produced by a particle
of type RR if (i) there is a multiple collision, i.e., if there are at least two pairs of non-BB particles
that choose the same labels; or (ii) the coin toss in the adjustment step (A) is a Head. Both
of these are events of (conditional) probability $O(y^4/N^2)$. Moreover, by Lemma 3, except with
vanishingly small probability, there is no site/time $(x,t)$ with more than 3 hybrid offspring of RR
parents. Consequently, on the event that the maximal number of particles at any site/time is no more
than $CN^{\alpha/2}$ (see (28) above), the process of RR $\to$ hybrid creations is dominated as follows: Let
each particle, in every generation, produce an additional Poisson-$2CN^{3\alpha/2 - 2}$ offspring; immediately
replace each such particle by 3 green particles, and let green particles only beget other green particles
in subsequent generations. Since $\alpha \le 2/3$, the rate at which green particles are produced by non-green
particles is $O(N^{-1})$, so Corollary 1 implies that the green particle process is asymptotically
negligible. This proves (37) in the SIS case. The SIR case is proved by a very similar construction.

□

The upshot of Proposition 3 is that in proving Theorem 1, the epidemic process (the red process
in the extended standard coupling) can be replaced by the red process in the modified standard
coupling.

2.5. Orthogonal martingale measures and convergence of stochastic integrals. Let $Y^k_t(x)$
be the number of particles at site $x \in \mathbb{Z}$ at time $[t] \in \mathbb{Z}_+$ in a one-dimensional branching random
walk with Poisson-1 offspring law $\mathcal{R}_\infty$. Denote by $X^k_t = \mathcal{F}_k Y^k_{kt}$ the corresponding rescaled
measure-valued process. For each $k$, the measure-valued process $X^k$ satisfies a martingale problem analogous
to that satisfied by the super-Brownian motion: If $\phi \in C^2_c$, the (cadlag) process

(39) $M^k_t(\phi) := \langle X^k_t, \phi \rangle - \langle X^k_0, \phi \rangle - \int_0^{[kt]/k} \langle X^k_s, A_k \phi \rangle \, ds$

is a martingale, where $A_k$ is the difference operator

(40) $A_k \phi(x) = \{\phi(x + 1/\sqrt{k}) + \phi(x - 1/\sqrt{k}) - 2\phi(x)\}/(3k^{-1})$.

(Since $X^k_t$ is constant on successive time intervals of duration $k^{-1}$, the integral in (39) is really
a sum.) The operator $\phi \mapsto M^k(\phi)$ extends to an orthogonal martingale measure $M^k(ds, dx)$ (see
[23] for the definition and basic stochastic integration theory). The measure $M^k$ is purely discrete,
putting mass only at points $(s,x) \in k^{-1}\mathbb{Z}_+ \times k^{-1/2}\mathbb{Z}$: at such points $s = n/k$, $x = m/\sqrt{k}$,

(41) $k M^k(\{s\}, \{x\}) = Y^k_n(m) - \lambda^k_n(m)$, where

(42) $\lambda^k_n(m) = \lambda^k(s,x) := E(Y^k_n(m) \mid \mathcal{H}_{n-1}) = \frac{1}{3} \sum_{i=0,-1,1} Y^k_{n-1}(m+i)$ and

(43) $\mathcal{H}_n = \sigma(\{Y^k_j(m) : m \in \mathbb{Z}, j \le n\})$

is the $\sigma$-algebra generated by the history of the evolution to time $n$. Note that mass is scaled by the
factor $k$, as required by the Feller-Watanabe normalization. Note also that, conditional on $\mathcal{H}_{n-1}$, the
random variables $\{Y^k_n(m)\}_{m \in \mathbb{Z}}$ are mutually independent: this implies that the martingale measure
$M^k$ is an orthogonal martingale measure.
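The atoms (41) are simple to compute from two consecutive generations of the branching random walk. The sketch below is illustrative only: configurations are represented as dictionaries mapping integer sites to particle counts, and the helper names `conditional_means` and `atoms` are ours, not the paper's.

```python
from collections import Counter


def conditional_means(prev: dict) -> Counter:
    """lambda_n(m) = (Y_{n-1}(m-1) + Y_{n-1}(m) + Y_{n-1}(m+1)) / 3, as in (42)."""
    lam = Counter()
    for site, count in prev.items():
        for shift in (-1, 0, 1):
            # Each particle sends one third of its expected offspring to each neighbor site.
            lam[site + shift] += count / 3
    return lam


def atoms(prev: dict, curr: dict, k: float) -> dict:
    """Masses M^k({s},{x}) = (Y_n(m) - lambda_n(m)) / k at the sites of generation n."""
    lam = conditional_means(prev)
    sites = set(lam) | set(curr)
    return {m: (curr.get(m, 0) - lam.get(m, 0.0)) / k for m in sites}


if __name__ == "__main__":
    previous = {0: 3}
    current = {0: 2, 1: 1}
    print(conditional_means(previous))
    print(atoms(previous, current, k=1.0))
```

Since the conditional means sum to the size of the previous generation, the total of the atoms in one generation is just the (rescaled) change in population size, which is the Galton-Watson martingale increment.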

Proposition 4. Assume that the initial particle densities $X^k(0, \cdot)$ satisfy the hypotheses of Theorem 2,
that is, they have common compact support and they converge in $C_b(\mathbb{R})$ to a continuous function
$X(0, \cdot)$. Then the random vectors $(X^k, M^k)$ consisting of the density functions $X^k(t,x)$ and the
orthogonal martingale measures converge weakly as $k \to \infty$ to $(X, M)$, where $X$ is the Dawson-Watanabe
density process (super-Brownian motion) with initial condition $X(0,x)$, and $M$ is its
associated orthogonal martingale measure.

Proof. Consider first the marginal distributions of the orthogonal martingale measures $M^k$, viewed as
random elements of the Skorohod space $D([0,\infty), \mathcal{S}')$, where $\mathcal{S}'$ is the space of tempered distributions
on $\mathbb{R}$. In order to prove that $M^k \Rightarrow M$ it suffices, by Mitoma's theorem (cf. [23], Th. 6.15), to prove
that (i) for any $\phi \in \mathcal{S}$ (where $\mathcal{S}$ is the Schwartz space of test functions), the processes $M^k_t(\phi)$ are
tight, and (ii) finite-dimensional distributions of $M^k_t(\phi_i)$ converge for all $\phi_i \in \mathcal{S}$. Both of these follow
routinely from the representation (39) and Watanabe's theorem: In particular, Watanabe's theorem
implies that for any finite subset $\{\phi_i\}_{i \in I}$ of $\mathcal{S}$,

$(\langle X^k_t, \phi_i \rangle)_{i \in I} \Longrightarrow (\langle X_t, \phi_i \rangle)_{i \in I}$

and

$\left( \int_0^t \langle X^k_s, A\phi_i \rangle \, ds \right)_{i \in I} \Longrightarrow \left( \int_0^t \langle X_s, A\phi_i \rangle \, ds \right)_{i \in I}$,

where

$A = \lim_{k \to \infty} A_k = \Delta/3$.

Consequently, to deduce (i)-(ii) above it suffices to show that for each $\phi = \phi_i$,

$\int_0^t \langle X^k_s, |A_k\phi - A\phi| \rangle \, ds + \int_{[kt]/k}^t \langle X^k_s, |A\phi| \rangle \, ds \Longrightarrow 0$.

Since $\|A_k\phi - A\phi\|_\infty \to 0$ for Schwartz-class functions $\phi$, this also follows from Watanabe's theorem.

It remains to show that the convergence $M^k \Rightarrow M$ holds jointly with $X^k(t,x) \Rightarrow X(t,x)$. Since
$X^k \Rightarrow X$ (Theorem 2) and $M^k \Rightarrow M$ marginally, the joint distributions are tight. Hence, to prove
that $(X^k, M^k)$ converge jointly, it suffices to show that the only possible weak limit is $(X, M)$, where
$X = X(t,x)$ is the density process associated with the Dawson-Watanabe process. But because a
continuous function $w(x)$ is determined by its integrals against Schwartz-class functions $\phi$, it suffices
to show that finite dimensional distributions of the vector-valued processes $(M^k_t\phi_i, \langle X^k_t, \phi_i \rangle)_{i \in I}$
converge to the corresponding joint distributions of $(M_t\phi_i, \langle X_t, \phi_i \rangle)_{i \in I}$. This follows by a repetition
of the argument in the preceding paragraph. □

Corollary 2. Let $\theta(t,x,w)$ be a bounded, jointly continuous function of $t \ge 0$, $x \in \mathbb{R}$, and
$w \in D([0,\infty), C_c(\mathbb{R}))$ such that for any $t$ the function $\theta(t,x,w)$ depends only on $w[0,t]$, that is,
$\theta(t,x,w) = \theta(t,x,w[0,t])$. Assume that the hypotheses of Proposition 4 hold. Then

(44) $\iint \theta(s,x,X^k) \, M^k(ds,dx) \Longrightarrow \iint \theta(s,x,X) \, M(ds,dx)$.

Moreover, (44) holds jointly with the convergence $X^k \Rightarrow X$ in $D([0,\infty), C_b(\mathbb{R}))$.

Proof. This can be deduced from Prop. 7.6 of [23], but verification of the hypothesis (7.5) is more
work than a direct proof. The elementary Prop. 7.5 of [23] implies that weak convergence (44) (and
joint convergence with $X^k \Rightarrow X$) holds for simple integrands

$\theta(t,x,w) = \sum_{i=1}^{n} a_i(w) 1_{(s_i,t_i]}(t) \phi_i(x)$

such that (a) each $\phi_i \in \mathcal{S}$, (b) each $a_i(w)$ is bounded, continuous in $w$, and $\mathcal{F}_{s_i}$-measurable (here
$(\mathcal{F}_s)$ is the natural filtration on $D([0,t], C_c(\mathbb{R}))$), and (c) none of the jump times $s_i, t_i$ coincides with
jumps of one of the martingale measures $M^k$. Clearly, any function $\theta$ satisfying the hypotheses of
Corollary 2 can be uniformly approximated by such simple functions. Consequently, to prove (44)
it suffices to show that for any $\varepsilon > 0$ there exists $\delta = \delta(\varepsilon) > 0$ such that for any simple function $\theta$
satisfying (a)-(c) and $\|\theta\|_\infty < \delta$,

(45) $P\left\{ \left| \iint \theta(t,x) \, M^k(dt,dx) \right| > \varepsilon \right\} < \varepsilon \quad \forall k$.



To establish (45) we use the special structure of the orthogonal martingale measure $M^k$. For
each $k$, this is a purely discrete random measure with atoms (41). By hypothesis, the conditional
distribution of $Y^k_n(m)$ given the past is Poisson with mean (and therefore also variance) $\lambda^k_n(m)$ (see
(42)), and the random variables $\{Y^k_n(m)\}_{m \in \mathbb{Z}}$ are, for each fixed $n$, conditionally independent given
the past. Hence, the predictable quadratic variation of the local martingale

$(\theta \cdot M^k)_t := \int_0^t \int_x \theta(s,x) \, M^k(ds,dx)$

is

$[\theta \cdot M^k]_t = \sum_s \sum_x 1_{[0,t)}(s) \, \theta(s,x)^2 \lambda^k(s,x)/k^2$,

where the sum is over the jump points $(s,x) \in k^{-1}\mathbb{Z}_+ \times k^{-1/2}\mathbb{Z}$ of the martingale measure $M^k$.
Thus, if $\|\theta\|_\infty < \delta$, then the quadratic variation of $(\theta \cdot M^k)_t$ is bounded by

$\delta^2 \sum_s \sum_x \lambda^k(s,x)/k^2 = \delta^2 \sum_s \sum_x Y^k(s,x)/k^2$.

But $\sum_s \sum_x Y^k(s,x)/k^2$ is just the total rescaled mass in the branching random walk, which by
Watanabe's theorem (or Feller's theorem) converges in law to the total mass in the standard
Dawson-Watanabe process. The inequality (45) now follows routinely. □



Unfortunately, the functions $\theta$ for which we would like to apply this result, namely those defined
by equations (14) and (13), are unbounded. Worse, the function $\theta(t,x) = X(t,x)$ in equation (13)
isn't even continuous (as a function of $X(t,x) \in D([0,\infty), C_c(\mathbb{R}))$, because such functions may have
jumps). The following corollary takes care of the first problem.

Corollary 3. Assume that the hypotheses of Proposition 4 are satisfied. Let $\theta(t,x,w) = \theta(t,x,w[0,t])$
be a jointly continuous function of $t \ge 0$, $x \in \mathbb{R}$, and $w \in D([0,\infty), C_c(\mathbb{R}))$ such that for every scalar
$C < \infty$ and every compact subset $F$ of $[0,\infty) \times \mathbb{R}$,

(46) $\sup\{ |\theta(t,x,w)| : \sup_{t',x'} |w(t',x')| \le C \text{ and } \mathrm{support}(w) \subset F \} < \infty$.

Then (44) holds jointly with the convergence $X^k \Rightarrow X$ in $D([0,\infty), C_b(\mathbb{R}))$.

Observe that the hypothesis is satisfied by the function $\theta$ defined by

Proof. Because the Dawson-Watanabe density process $X(s,x)$ is almost surely continuous with
compact support, continuity of $\theta$ ensures that

$\iint \theta(s,x,X)^2 X(s,x) \, ds \, dx < \infty$,

and this in turn guarantees that

(47) $\lim_{C \to \infty} \iint (\theta(s,x,X) \wedge C) \, M(ds,dx) = \iint \theta(s,x,X) \, M(ds,dx)$

and that the limit is a.s. finite. By the hypothesis (46), the function $\theta$ is uniformly bounded on any
set of sample evolutions $w = (x_s)$ such that support($w$) and $\sup |w|$ are uniformly bounded. By the
results of sec. 2.2 above, the supports and suprema of the random functions $X^k(t,x)$ are tight, that
is, for any $\varepsilon > 0$ there exist a compact set $F = F_\varepsilon$ and a constant $C = C_\varepsilon < \infty$ such that for all $k$,

$P\{ \mathrm{support}(X^k) \subset F \text{ and } \max X^k(t,x) \le C \} \ge 1 - \varepsilon$.

Consequently, since the support of the martingale measure $M^k$ is contained in that of $X^k$, it follows
that for any $\varepsilon > 0$ there exists $C < \infty$ such that for all $k$,

$\iint \theta(s,x,X^k) \, M^k(ds,dx) = \iint (\theta(s,x,X^k) \wedge C) \, M^k(ds,dx)$

except on a set of probability $\le \varepsilon$. Weak convergence (44) (and joint weak convergence with
$X^k \Rightarrow X$) now follows routinely from Corollary 2 and (47). □

The function $\theta(t,x) = X(t,x)/2$ that occurs in the SIS case of Theorem 1 is not continuous
for functions $X(t,x)$ in the space $D([0,\infty), C_c(\mathbb{R}))$, because such functions may have jumps. Thus,
Corollaries 2-3 do not apply directly. The following corollary addresses this case specifically.

Corollary 4. Assume that the hypotheses of Proposition 4 are satisfied, and define $\theta^k(s,x) =
\lambda^k(s,x)/\sqrt{k}$, where $\lambda^k(s,x)$ is as in (42) above. Then

(48) $\iint \theta^k(s,x) \, M^k(ds,dx) \Longrightarrow \iint X(s,x) \, M(ds,dx)$,

and this convergence holds jointly with $X^k \Rightarrow X$.

Proof. Although the random functions $X^k(t,x)$ and $\lambda^k(t,x)$ have jumps, the jumps are, with high
probability, small, because $X^k(t,x) \Rightarrow X(t,x)$, by Theorem 2. Consequently, it is still possible to
approximate $\theta^k$ by bounded, continuous functions $\psi$ in such a way that the stochastic integrals of
$\theta^k$ relative to $M^k$ are well approximated, with high probability, by those of $\psi$. In particular, define,
for any $\varepsilon > 0$ and $C < \infty$,

$\psi(s,x) = \psi_{C,\varepsilon}(s,x,w) = \frac{1}{2\varepsilon^2} \iint_{s' \in [s-\varepsilon,s],\ x' \in [x-\varepsilon,x+\varepsilon]} w(s',x') \wedge C \, ds' \, dx'$;

then for any $\delta > 0$ there exist $C, \varepsilon$ such that for all $k$,

(49) $P\{ \max_{s,x} |\theta^k(s,x) - \psi(s,x,X^k)| \ge \delta \} < \delta$.

Clearly, $\psi(s,x,w)$ is jointly continuous in its arguments and uniformly bounded by $C$, so it meets
the requirements of Corollary 2. It follows that for any $C, \varepsilon$,

$\iint \psi(s,x,X^k) \, M^k(ds,dx) \Longrightarrow \iint \psi(s,x,X) \, M(ds,dx)$,

and this holds jointly with $X^k \Rightarrow X$. Thus, to prove the corollary, it suffices to show that if $C$ and
$\varepsilon$ are suitably chosen then the differences between the stochastic integrals of $\theta^k$ and $\psi$ against the
martingale measures $M^k$ are small with high probability, uniformly in $k$.

By virtually the same calculation as in the proof of Corollary 3, the local martingale

$\iint_{s \le t} (\theta^k(s,x) - \psi(s,x,X^k)) \, M^k(ds,dx) := \iint_{s \le t} \Delta^k(s,x) \, M^k(ds,dx) = (\Delta^k \cdot M^k)_t$

has predictable quadratic variation

$[\Delta^k \cdot M^k]_t = \sum_s \sum_x 1_{[0,t)}(s) \, \Delta^k(s,x)^2 \lambda^k(s,x)/k^2$
$\le \max_{s,x} \Delta^k(s,x)^2 \sum_s \sum_x Y^k(s,x)/k^2$.

If $\varepsilon > 0$ is sufficiently small and $C < \infty$ sufficiently large, then by inequality (49), $\|\Delta^k\|_\infty < \delta$
except with small probability, uniformly for all $k$. Recall that the sum $\sum_s \sum_x Y^k(s,x)/k^2$ is the total
rescaled mass in the branching process, and so by Feller's theorem converges in law to the total mass
in the standard Dawson-Watanabe process. Hence, with high probability, the quadratic variation
$[\Delta^k \cdot M^k]_\infty$ will be small, provided $C$ and $\varepsilon$ are chosen so that $\delta$ is small. Therefore, by standard
martingale arguments, the maximum modulus of the stochastic integral $(\Delta^k \cdot M^k)_t$ will be small,
with high probability, uniformly in $k$. □

2.6. Likelihood ratios: generalities. The strategy of the proof of Theorem 1 is to show that the
likelihood ratios of the (modified) spatial epidemic processes relative to their branching envelopes
converge weakly to the likelihood ratios of the appropriate Dawson-Watanabe processes with killing
relative to the Dawson-Watanabe process with no killing. The likelihood ratio of the Dawson-Watanabe
process with killing is given by (10). This expression involves a stochastic integral relative
to the orthogonal martingale measure of a standard Dawson-Watanabe process, and so to prove weak
convergence we will express the likelihood ratios of the epidemic processes in terms of stochastic
integrals.

Consider a sequence $Y^N_t(x)$ of counting processes, and probability measures $P = P^N$, $Q = Q^N$
such that under $P$ the process is a branching random walk with offspring law $\mathcal{R}_\infty$, and under $Q$
it is a modified epidemic process (that is, the red particle process in the modified standard coupling
of sec. 2.4). Assume that the initial conditions are common under $P$ and $Q$, and satisfy the
hypotheses of Theorem 1. Let $\mathcal{H}_t$ be the $\sigma$-algebra generated by the history of the evolution to
time $t$, and set

(50) $\lambda^N_t(x) := E_P(Y^N_t(x) \mid \mathcal{H}_{t-1})$

$= (Y^N_{t-1}(x-1) + Y^N_{t-1}(x) + Y^N_{t-1}(x+1))/3$.

Under $P$, the random variables $Y^N_t(x)$ are, for each $t$, conditionally independent Poisson r.v.s,
given $\mathcal{H}_{t-1}$, with conditional means $\lambda^N_t(x)$. In the modified standard couplings, the color choices at
the various sites $x$ are, conditional on $\mathcal{H}_{t-1}$ and on the numbers of red-parent offspring at the various
sites, mutually independent, with at most one blue offspring of a red parent at any site/time. Hence,
the event $Y^N_t(x) = y$ could occur in one of only two ways: (1) there are $y$ offspring of red parents,
none of which takes the color blue; or (2) there are $y + 1$ offspring of red parents, one of which
is blue. It follows that the relative likelihood $dQ^N/dP^N$ of a sample evolution $(y_t(x))_{t,x}$ is
given by

(51) $\frac{dQ^N}{dP^N} = \prod_t \prod_x L_N(t,x) = \prod_t \prod_x \frac{p(y \mid \lambda)(1 - \kappa_N(y)) + p(y+1 \mid \lambda)\kappa_N(y+1)}{p(y \mid \lambda)}$

where in each factor, $y = y_t(x)$ and $\lambda = \lambda^N_t(x)$, and $\kappa_N(y) = \kappa_{N,t,x}(y)$ is the conditional probability
in the modified coupling, given $y$ offspring of red parents at $(t,x)$, that one of these is colored blue.
Here $p(\cdot \mid \lambda)$ is the Poisson density with parameter $\lambda$. The conditional probability $\kappa_N(y)$ is given by
(34) for SIS epidemics, and by (35) for SIR epidemics.

Although the product (51) extends over infinitely many sites and times $x, t$, all but finitely many
of the factors are 1: In particular, if the nearest neighbors (in the previous generation) $(x', t-1)$
of site $(x,t)$ are devoid of particles, then the factor $L_N(t,x)$ indexed by $(x,t)$ must be 1. Recall
(by (23) and (27) of sec. 2.2) that under the hypotheses of Theorem 1 the number of site/time
pairs $(x,t)$ visited by the branching envelope is of order $O_P(N^{3\alpha/2})$. Thus, the number of nontrivial
factors in (51) is typically of the same order.



2.7. Likelihood ratios: SIR epidemics. Analysis of the likelihood ratios is somewhat easier in
the SIR case, in that the error terms are more easily disposed of. This is because in the SIR case
Theorem 1 requires that $\alpha \le 2/5$, and so the number of sites that contribute to the product (51) is
$O_P(N^{3/5})$. In particular, error terms of order $o_P(N^{-3/5})$ can be ignored.

Recall that in the modified standard coupling for the SIR epidemic, the conditional probability
$\kappa_N(y) = \kappa_{N,t,x}(y)$ that there is a blue offspring of a red parent at $(t,x)$ is $yR/N$, where $R = R^N_t(x)$
is the number of labels used previously at site $x$ (see (35)). Hence, the contribution to the likelihood
ratio from the site $(t,x)$, on the sample evolution $Y^N_t(x) = y_t(x)$, is

(52) $L_N(t,x) = 1 - yR/N + (\lambda/(y+1))(y+1)R/N$
$= 1 - (y - \lambda)R/N$.

Thus, in the SIR case, the likelihood ratio (51) can be written as

(53) $L_N = \prod_t \prod_x \left(1 - (Y^N_t(x) - \lambda^N_t(x)) R^N_t(x)/N\right)$

$= (1 + \varepsilon_N) \exp\left\{ -\sum_t \sum_x \Delta^N_t(x)\theta^N_t(x) - \frac{1}{2} \sum_t \sum_x \Delta^N_t(x)^2 \theta^N_t(x)^2 \right\}$

where

(54) $\Delta^N_t(x) = (Y^N_t(x) - \lambda^N_t(x))/N^{\alpha}$,

(55) $\theta^N_t(x) = R^N_t(x)/N^{1-\alpha}$, and

(56) $\varepsilon_N = o_P(1)$.

(Note: The error in the two-term Taylor series approximation to the logarithm is of magnitude
$O_P(N^{9\alpha/2 - 3})$, which is asymptotically negligible for $\alpha \le 2/5$; hence (56).)
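The algebra behind the single-site factor (52) can be verified in exact rational arithmetic. The sketch below is illustrative: `site_factor` implements the generic factor from (51) using the Poisson identity $p(y+1 \mid \lambda)/p(y \mid \lambda) = \lambda/(y+1)$, and the other names are hypothetical helpers rather than notation from the paper.

```python
from fractions import Fraction


def site_factor(y: int, lam, kappa) -> Fraction:
    """Generic factor of (51): 1 - kappa(y) + [p(y+1|lam)/p(y|lam)] * kappa(y+1).

    For Poisson densities the ratio p(y+1|lam)/p(y|lam) equals lam/(y+1).
    """
    return 1 - kappa(y) + Fraction(lam) / (y + 1) * kappa(y + 1)


def sir_kappa(R: int, N: int):
    """Return the SIR collision probability y -> y R / N as a function of y."""
    return lambda y: Fraction(y * R, N)


def sir_closed_form(y: int, lam, R: int, N: int) -> Fraction:
    """Closed form of the SIR factor: 1 - (y - lam) R / N."""
    return 1 - (y - Fraction(lam)) * R / N


if __name__ == "__main__":
    R, N = 7, 100
    for y in range(4):
        lam = Fraction(3, 2)
        print(y, site_factor(y, lam, sir_kappa(R, N)), sir_closed_form(y, lam, R, N))
```

Because the factor is linear in $y - \lambda$, its logarithm expands to $-\Delta\theta - \frac{1}{2}\Delta^2\theta^2$ plus higher-order terms, with $\Delta\theta = (y - \lambda)R/N$; this is the origin of the two sums in the exponential.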

The two sums that occur in the last exponential in (53) are a stochastic integral and its corresponding
quadratic variation, respectively. To see this, observe that the quantities $\Delta^N_t(x)$ coincide with
the masses (41) in the orthogonal martingale measures $M^N$ associated with the branching random
walks $Y^N$. (In (41), $k = N^{\alpha}$; it makes more sense here to index by $N$ rather than $k$.) Consequently,

(57) $\sum_t \sum_x \Delta^N_t(x)\theta^N_t(x) = \iint \theta^N(t,x) \, M^N(dt,dx)$,

where

$\theta^N(t,x) = \theta^N_{tN^{\alpha}}(xN^{\alpha/2}) = N^{5\alpha/2 - 1} \int_{s=0}^{t} X^N(s,x) \, ds$.

For $\alpha < 2/5$, this converges to zero as $N \to \infty$; for $\alpha = 2/5$ it coincides with $\theta(t,x) = \theta(t,x,X^N)$
as given by (14). Therefore, Corollary 3 implies that

(58) $\iint \theta^N(t,x) \, M^N(dt,dx) \Longrightarrow \iint \theta(t,x) \, M(dt,dx)$,

where $M$ is the orthogonal martingale measure attached to the standard Dawson-Watanabe process
and $\theta(t,x)$ is as in parts (c)-(d) of Theorem 1. (Corollary 3 also implies that the convergence holds
jointly with $X^N \Rightarrow X$.) Consequently, to complete the proof that

(59) $L_N \Longrightarrow L = \exp\left\{ -\iint \theta(t,x) \, M(dt,dx) - \frac{1}{2} \iint \theta(t,x)^2 X(t,x) \, dt \, dx \right\}$,

it suffices to prove that

(60) $\sum_t \sum_x \Delta^N_t(x)^2 \theta^N_t(x)^2 \Longrightarrow \iint \theta(t,x)^2 X(t,x) \, dt \, dx$.

Proof of (60). This uses only Theorem 2 and a variance calculation. Define

(61) $A^N = \sum_t \sum_x \Delta^N_t(x)^2 \theta^N_t(x)^2$ and

(62) $B^N = \sum_t \sum_x \lambda^N_t(x)\theta^N_t(x)^2/N^{2\alpha}$;

we will show that

(63) $A^N - B^N = o_P(1)$.

Since $\lambda^N_{tN^{\alpha}}(xN^{\alpha/2})/N^{\alpha/2} - X^N(t,x) = o_P(1)$ as $N \to \infty$, by Theorem 2 (recall from (50) that $\lambda^N(t,x)$
is the average of the counts $Y^N_{t-1}(x')$ over the neighboring sites $x' = x, x \pm 1$),

$B^N = \iint X^N(t,x)\theta^N(t,x)^2 \, dt \, dx + o_P(1)$.

Theorem 2 also implies that

$\iint X^N(t,x)\theta^N(t,x)^2 \, dt \, dx \Longrightarrow \iint X(t,x)\theta(t,x)^2 \, dt \, dx$.

Therefore, proving (63) will establish (60).
For any constant $C < \infty$, define 

$$\tau_C = \tau^N_C = \max\{t : \max_x \max_{s \le t} Y^N_s(x) < CN^{\alpha/2}\},$$

and let $A^N_C$ and $B^N_C$ be the restrictions of the sums to the range $t \le \tau_C \wedge CN^{\alpha}$ and 
$|x| \le CN^{\alpha/2}$. Note that $\tau_C$ is a stopping time, and that $\lambda^N_t(x) \le CN^{\alpha/2}$ on the event $t \le \tau_C$. 
Since the range of summation in (61) and (62) is limited by (23) and (27), for any $\varepsilon > 0$ there exists 
$C = C_\varepsilon < \infty$, independent of $N$, such that 

$$A^N = A^N_C \quad\text{and}\quad B^N = B^N_C$$

except possibly on events of probability $\le \varepsilon$. Thus, it suffices to prove (63) with $A^N, B^N$ replaced 
by $A^N_C, B^N_C$. For this, we use the fact that the offspring counts $Y^N_t(x)$ have conditional Poisson 
distributions (given the past $\mathcal{F}_{t-1}$) with means $\lambda^N_t(x)$. This implies that the conditional means 
and conditional variances coincide, and that 

(64) Ep{{{Y,^ix) - Af (x))^ - Af (x))^ = 2Af (x)^ 

For t < Tc, the right side is bounded by 2C^N°'. Furthermore, for t < tc /\C, 

Thus, the conditional variance of each term in the sum A^ — B^ is bounded by 2(7^" A^^"^'*. Since 
the number of nonzero terms in the sum is C(2C + l)iV3"/2, it follows that 

But $\alpha \le 2/5$, so this converges to 0 as $N \to \infty$. □ 

2.8. Likelihood ratios: SIS epidemics. For SIS epidemics, Theorem 1 requires that $\alpha \le 2/3$, so 
by (23) and (27), the number of sites/times $(x,t)$ that contribute nontrivially to the likelihood ratio 
product (51) is $O_P(N^{3\alpha/2}) = O_P(N)$. Thus, error terms of magnitude $o_P(N^{-1})$ can be ignored in 
each factor. 
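
As a quick check of this count (a sketch only; $C$ stands for a generic truncation constant of the kind used in the SIR argument, and is an assumption here rather than a quantity fixed by the text): times range over $t \le CN^{\alpha}$ and sites over $|x| \le CN^{\alpha/2}$, so

$$\#\{(x,t)\} \le CN^{\alpha}\,(2CN^{\alpha/2} + 1) = O(N^{3\alpha/2}), \qquad N^{3\alpha/2} \le N \iff \alpha \le 2/3.$$

With $O_P(N)$ factors, an error of size $o_P(N^{-1})$ per factor contributes only $o_P(1)$ in total, which is why such errors can be dropped.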

In the modified standard couplings for SIS epidemics, the conditional probability that there is 
a blue offspring of a red parent at {x,t), given y red-parent offspring in total at {x,t), is KNiy) = 
y{y - 1)/2N (see Hence, 

= l-yiX-y-l)/2N 

= 1-X{y- X)/2N -iy~ \f /2N + y/2N. 
Since terms of order $o_P(N^{-1})$ can be ignored, (51) can be written as 

(65)  $$L_N = (1 + o_P(1)) \prod_{t \ge 1} \prod_{x \in \mathbb{Z}} \exp\{-A_N(t,x) - B_N(t,x) - C_N(t,x)/2\}$$

where 

$$A_N(t,x) = \lambda(y - \lambda)/2N;$$
$$B_N(t,x) = \lambda^2(y - \lambda)^2/8N^2;$$
$$C_N(t,x) = (y - \lambda)^2/N - y/N.$$

Here we continue to use the convention $y = y_t(x)$ and $\lambda = \lambda^N_t(x)$, as in (51). Hence, to prove the 
convergence $L_N \Rightarrow L$ (jointly with that of $X^N(t,x) \Rightarrow X(t,x)$), it will suffice to show that 

(66)  $$\sum_{t,x} A_N(t,x) \Longrightarrow \iint \theta(t,x)\,M(dt,dx);$$

(67) ^BN{t,x)=^ e{t,xfX{t,x)dtdx; and 

t,x •' 

(68)  $$\sum_{t,x} C_N(t,x) \Longrightarrow 0,$$

where $\theta(t,x) = 0$ for $\alpha < 2/3$ and $\theta(t,x) = X(t,x)/2$ for $\alpha = 2/3$, as in parts (a)-(b) of Theorem 1. 

Proof of (66). This is virtually identical to the proof of the analogous convergence in the SIR case. 
The increments (x) := {Y/^ (x) — X^{x))/N°' are the masses (|4T|) in the orthogonal martingale 
measures M^, and so 



$$\sum_{t,x} A_N(t,x) = \iint \theta^N(t,x)\,M^N(dt,dx),$$

where $\theta^N(t,x) = \lambda^N_{N^{\alpha}t}(xN^{\alpha/2})/2N^{1-\alpha}$. If $\alpha < 2/3$ then $\max_{t,x} \theta^N(t,x) \to 0$ in probability, whereas 
if $\alpha = 2/3$ then $\theta^N(t,x)$ meets the requirements of Corollary 3. Thus, the convergence (66) follows 
from Corollary 3. □ 

Proof of (67). This proceeds in the same manner as the proof of (60) in the SIR case, by showing 
that the term $B_N(t,x)$ in the sum (67) can be replaced by $B'_N(t,x) := \lambda^3/8N^2$. To do this, we 
truncate the sum of the differences in exactly the same way as in the proof of (60), using the same 
stopping times $\tau_C = \tau^N_C$. Note that the truncated sum once again has $C(2C+1)N^{3\alpha/2}$ terms, and 
that in each term, $\lambda \le CN^{\alpha/2}$. Using the conditional variance formula (64), we find that after 
truncation, 

E (j2Y.^BNit,x) - B'j^{t,x))y < C'iV3"/27Vf5"/V^^ < C'N-\ 
since $\alpha \le 2/3$. Consequently, $\sum_t \sum_x B_N$ can be replaced by $\sum_t \sum_x B'_N$. But 

^5]B^(t,x)-^5]Af(a;)V8Af2^ f f X{t,x)9{t,xf dxdt 

t X t X 

by Theorem 2 and the definition of $\theta$ (parts (a)-(b) of Theorem 1). □ 

Proof of (68). This is based on a variance calculation similar to those used to prove (60) and (67). 
Note first that the terms $C_N(t,x)$ constitute a martingale difference sequence relative to the filtration 
$\mathcal{F}_t$, because the mean and variance of a Poisson random variable coincide. The conditional variances 
of the terms $C_N(t,x)$ can be estimated as follows: If $Y \sim$ Poisson($\lambda$) with $\lambda \ge 1$, then 

E{iY - Xf - E{{Y - A)2 - \f + E{Y - Xf + 2E({Y - Xf - \){Y - A) 

< 2A2 + A + 2\/2A2A 

< bV2X^. 

Now truncate the sum $\sum_t \sum_x C_N(t,x)$ as before (that is, $t \le \tau_C \wedge CN^{\alpha}$ and $|x| \le CN^{\alpha/2}$): Since the 
number of terms is $C'N^{3\alpha/2}$ and $\lambda \le CN^{\alpha/2}$ for each term, the variance of the truncated sum is 
bounded by $C''N^{5\alpha/2}/N^2$. For $\alpha \le 2/3$, this converges to zero as $N \to \infty$. □ 
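
The Poisson moment estimate behind this variance bound can be checked numerically. The sketch below is illustrative only: the function names and the truncation point of the series are assumptions made here, not part of the original argument. It sums the Poisson series for $E((Y - \lambda)^2 - Y)^2$ directly. A short calculation with the central moments $\mu_2 = \mu_3 = \lambda$ and $\mu_4 = \lambda + 3\lambda^2$ shows that this quantity equals $2\lambda^2$ exactly, so it is of order $\lambda^2$, as the proof requires.

```python
import math

def poisson_pmf(lam, kmax):
    """Return [P(Y = k) for k = 0..kmax] for Y ~ Poisson(lam), built iteratively."""
    p = math.exp(-lam)
    pmf = [p]
    for k in range(1, kmax + 1):
        p *= lam / k
        pmf.append(p)
    return pmf

def second_moment(lam):
    """E[((Y - lam)^2 - Y)^2], summed over a range carrying all but a negligible tail."""
    # Truncation point: an illustrative choice, far out in the Poisson tail.
    kmax = int(lam + 40.0 * math.sqrt(lam) + 60.0)
    pmf = poisson_pmf(lam, kmax)
    return sum(p * ((k - lam) ** 2 - k) ** 2 for k, p in enumerate(pmf))

if __name__ == "__main__":
    for lam in (1.0, 4.0, 25.0):
        # Second column: numerical value; third column: 2 * lam^2 for comparison.
        print(lam, second_moment(lam), 2.0 * lam ** 2)
```

Since each term $C_N(t,x)$ carries a factor $1/N$, its conditional variance is then of order $\lambda^2/N^2$, which is the per-term bound that gets multiplied by the number of terms in the truncated sum.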

3. Weak Convergence in $\mathbb{D}([0,\infty), C_b(\mathbb{R}))$: Proof of Theorem 2 

In this section we prove Theorem 2 by verifying that the sequence of random functions $X^k(t,x)$ 
is tight, provided the hypotheses of Theorem 2 on the initial densities hold. 

3.1. Moment Estimates. The proof will be based on moment estimates for occupation counts of 
a branching random walk started by a single particle located at the origin at time 0. Denote by 
$Y_n(x)$ the number of particles at site $x \in \mathbb{Z}$ at time $n \in \mathbb{Z}_+$, and by $E^x, P^x$ the expectation operator 
and probability measure under which the branching random walk is initiated by a single particle 
located at $x$. For notational ease set 

(69)  $$\varphi(x) = \exp\{-x^2\},$$

$$\varphi_n(x) = \exp\{-x^2/n\},$$

$$\Phi_n(x,y) = \varphi_n(x) + \varphi_n(y).$$

Proposition 5. There exist finite constants $C_m, \beta_m$ such that for all $m, n \in \mathbb{N}$, all $x, y \in \mathbb{Z}$, and all 
$\alpha \in (0,1)$, 

(70)  $$E^0 Y_n(x)^m \le C_m n^{-1} n^{m/2} \varphi_n(\beta_m x),$$
(71)  $$|E^0 (Y_n(x) - Y_n(y))^m| \le C_m n^{-1} n^{m/2} |(x-y)/\sqrt{n}|^{m/5} \Phi_n(\beta_m x, \beta_m y),$$
(72)  $$|E^0 (Y_n(x) - Y_{n+[\alpha n]}(x))^m| \le C_m n^{-1} n^{m/2} \alpha^{m/5} \varphi_n(\beta_m x).$$

The (somewhat technical) proofs will be given in sections 3.5-3.7 below. The exponents $m/5$ on 
$\alpha$ and $|x-y|/\sqrt{n}$ in (72) and (71) are not optimal, but the orders of magnitude in the estimates 
make sense, as can be seen by the following reasoning: The probability that a branching random 
walk initiated by a single particle survives for $n$ generations is $O(n^{-1})$, and on this event the total 
number of particles in the $n$th generation is $O(n)$. Thus, on the event of survival to generation $n$, 
the number of particles $Y_n(x)$ at a site $x$ at distance $\sqrt{n}$ from the origin should be $O(\sqrt{n})$. This is 
consistent with an $m$th moment of order $O(n^{-1} n^{m/2})$. 

3.2. Tightness in $\mathbb{D}([\delta,\infty), C_b(\mathbb{R}))$. The proof of tightness will be broken into two parts: In this 
section, we will show that for any $\delta > 0$, under the hypotheses of Theorem 2 the density processes 
$X^k(t,x)$ restricted to the time interval $[\delta,\infty)$ are tight. Using this and an auxiliary smoothness 
result of Shiga [22] for the Dawson-Watanabe process, we will then conclude in sec. 3.3 that the 
density processes $X^k(t,x)$ with $t$ restricted to some interval $[0,\delta]$ are tight. 

Standing Assumptions: In sections 3.2-3.3, $Y^k_t(x)$ will be a sequence of branching random walks 
satisfying the hypotheses of Watanabe's theorem, and $X^k(t,x)$ will be the corresponding renormalized 
density processes, defined by equation (15). In addition, assume that the initial configurations 
are such that for every $k$, the initial particle density function $X^k(0,\cdot)$ has support contained in $J$, 
for some fixed compact interval $J$. 

Proposition 6. There exist constants $C = C_{n,p} < \infty$ and $\beta > 0$ such that for every $k = 1, 2, \ldots$, 
all $x, y \in \mathbb{R}$, and all $s, t \in [1/n, n]$, 

(73)  $$E|X^k(t,x) - X^k(t,y)|^p \le C|x-y|^{2+\delta}(\varphi(\beta x) + \varphi(\beta y))$$ and 

(74)  $$E|X^k(t,x) - X^k(s,x)|^p \le C|t-s|^{2+\delta}\varphi(\beta x).$$

Note. Under the conditions of Watanabe's theorem, the bounds (73)-(74) will in general hold only 
for $t, s$ away from zero: in fact, if $X_0$ is singular then the densities $X^k(t,x)$ will blow up as $t \to 0$. 

Proof. The inequalities (73)-(74) follow from inequalities (71)-(72), respectively, with $p = m \ge 12$. 
To see this, observe that the moments in (73)-(74) can be related to the corresponding moments for 
branching random walks Y started from single particles located at points x G jVk- If nk := ll^^J^H 
is the number of particles in the initial configuration of the kth BRW then for any even integer m, 

m r 

(75) E\X\t,x)-X\t,y)r^Y. E E X{k-^^"E-^{Y^,,^{./kx) -Y^,,^{^y)r^ 

where $\mathcal{P}_r(m)$ is the set of all integer partitions of $m$ with $r$ nonzero elements $m_i$, the inner sum is 
over all choices of $r$ particles from the particles in the initial configuration, and $x_i$ is the location 
of the $i$th particle. Since $\mathcal{F}_k Y^k_0 \Rightarrow X_0$, the masses $n_k$ must be asymptotically proportional to $k$, so 
the number of choices in the inner sum is $\le Ck^r$, for some constant $C$ independent of $k$. Since the 
initial configurations are all restricted to lie in $\sqrt{k}J$, with $J$ compact, the bounds implied by (71) 
for $E^{x_i}$ are comparable to those for $E^0$, after changing $\beta_m$ to $\beta_m/2$, because the Gauss kernel $\varphi$ 
satisfies $\varphi(x-y) \le C_J\varphi(x/2)$ for all $x \in \mathbb{R}$ and $y \in J$. Thus, by (71), 

m r 

RHSI\7B) <CY1 E k''Y[k-^\x~yr^^'^^i{Px,f3y), 
provided $t$ is bounded away from 0 and $\infty$. This clearly implies (73). A similar argument gives 
(74). □ 

Corollary 5. Assume that all of the measures $\mathcal{F}_k Y^k_0$ have supports $\subset J$, where $J$ is compact, and 
assume that the hypothesis (7) of Watanabe's theorem holds. Then for every $\delta > 0$, the random 
functions $\{X^k(t,x)\}_{t \ge \delta}$ converge weakly in the Skorohod space $\mathbb{D}([\delta,\infty), C_b(\mathbb{R}))$ to the Dawson-Watanabe 
density process $\{X(t,x)\}_{t \ge \delta}$ restricted to time $t \ge \delta$. 

Remark 1. Observe that Corollary 5 holds even for initial conditions $Y^k_0(\cdot)$ whose Feller rescalings 
$\mathcal{F}_k Y^k_0$ converge to singular measures. 

Proof. Since the associated measure-valued processes converge to the Dawson-Watanabe process, it 
suffices to show that the sequence $\{X^k(t,x)\}_{t \ge \delta}$ is tight. This follows from inequalities (73)-(74) by 
the usual Kolmogorov-Chentsov argument. (See, for instance, [3], Th. 12.3, or [TS', Problem 4.11 for 
the one-parameter case. Here, since there are two parameters t,x, the exponent must be > 2.) □ 

3.3. Tightness in $\mathbb{D}([0,\infty), C_b(\mathbb{R}))$. It remains to show that under the stronger hypotheses of 
Theorem 2 the rescaled particle densities $X^k(t,x)$ are tight for $t \in [0,\infty)$. It is possible to do this by 
estimating moments, but this is messy. Instead, we will use a soft argument, based on an estimate 
for the Dawson-Watanabe process proved by Shiga ([22], Lemma 4.2): 

Lemma 4. Let $X(t,x)$ be the density of a standard Dawson-Watanabe process with initial condition 
$X(0,x) = f(x)$ and variance parameter $\sigma^2$. Define 

(76)  $$N(t,x) = X(t,x) - G_t f(x)$$

where $G_t f$ is the convolution of $f$ with the Gaussian density of variance $\sigma^2 t$. For any compact 
interval $J$ there exist constants $C_1, C_2$ such that for all $\varepsilon, T > 0$, if $f \le \beta 1_J$ then 

(77) P{\N{t, x)\ > e/3e-(^-*)l^l for some t < T/2 and xeR}< Gie'^'^ expl-Cae^T-^/^}. 

Consequently, for any $\alpha > 1$, $\varepsilon > 0$, and compact interval $J$ there exists $T > 0$ such that if $f \le \beta 1_J$ 
then 

(78)  $$P\{X(t,x) \ge \alpha\beta \text{ for some } t \le T \text{ and } x \in \mathbb{R}\} \le \varepsilon.$$

The strategy now is to use Lemma 4 to deduce a maximal inequality for the density $X^k(t,x)$ of a 
branching random walk over a short time interval $t \in [0,\delta]$. For this, we use the weak convergence 
result of Corollary 5 together with monotonicity of the BRW in the initial condition. Note that 
adding particles to the initial configuration of a BRW has the effect of augmenting the original BRW 
by an independent BRW initiated by the set of new particles. Thus, for any two initial particle 
configurations Y^^'^ and Y^'^ whose discrepancy satisfies 

\Y^^^{x)-Y^^''{x)\<l3kX^,,{x) 

there exist coupled branching random walks Y^'^ and Y^'^ with initial conditions Y^^'^ and Y^'^ 
whose difference is bounded in absolute value by a branching random walk Y^''^ with initial condition 

Lemma 5. For any e > and compact interval J there exist (3,T > Q such that if 

(79) Y^<l31rj:j then P{supsupX''(i,a;) > e} < e. 

t<T a:eR 

Proof. Consider first a sequence of branching random walks with initial densities $X^k(0,\cdot) = 2\beta 1_{J^*}$, 
where $J^*$ is a compact interval containing $J$ in its interior. By Corollary 5, for any $T > 0$ the 
random functions $X^k(T+t,x)$ (with $t \ge 0$) converge to the density $X(T+t,x)$ of a standard 
Dawson-Watanabe process with initial density $X(0,x) = 2\beta 1_{J^*}(x)$. (Note: The variance parameter 
here is $\sigma^2 = 2/3$.) Hence, by Shiga's Lemma 4, if $T > 0$ is sufficiently small then 

(80)  $$P\{\min_{x \in J} X^k(T,x) \le \beta\} \le \varepsilon \quad\text{and}\quad P\{\sup_{t \in [T,2T]} \sup_{x \in \mathbb{R}} X^k(t,x) \ge 4\beta\} \le \varepsilon.$$

It follows that for each $k$, the particle configuration at time $T$ is such that the renormalized density 
$X^k(T,\cdot)$ exceeds $\beta 1_J$ except on an event of probability $\le \varepsilon$. 

Now consider branching random walks with initial densities $X^k(0,\cdot) = \beta 1_J$. By monotonicity 
in initial configurations, the corresponding particle density processes $X^k(t,x)$ are dominated by the 
density processes $X^k(T+t,x)$ of the preceding paragraph, except on events of probability $\le \varepsilon$. 
Consequently, by the second inequality in (80) and the Markov property, 

$$P\{\sup_{t \le T} \sup_x X^k(t,x) \ge 4\beta\} \le 2\varepsilon.$$

□ 

Proof of Theorem 2. Assume now that the initial particle densities $f_k(\cdot) := X^k(0,\cdot)$ satisfy the 
hypotheses of Theorem 2; in particular, all have support contained in the compact interval $J$, and 
$f_k \to f$ uniformly for some continuous function $f$. By Corollary 5, for every $T > 0$ the density 
processes $X^k(T+t,x)$ converge weakly to $X(T+t,x)$, where $X(t,x)$ is the density process of the 
Dawson-Watanabe process with initial density $f$. By Kesten's Theorem [TB], for any $\varepsilon > 0$ there is a compact 
interval $J^* \supset J$ such that for every $T > 0$ and $k \in \mathbb{N}$, with probability at least $1 - \varepsilon$, the function 
$X^k(T,\cdot)$ has support contained in $J^*$. By Shiga's Lemma, for any $\varepsilon > 0$, if $T > 0$ is sufficiently 
small, 

$$P\{\sup_x |X^k(T,x) - X^k(0,x)| \ge \varepsilon\} \le \varepsilon.$$

Hence, by Lemma 5 and a comparison argument, it follows that for any $\varepsilon > 0$, if $T > 0$ is sufficiently 
small, 

$$P\{\sup_{t \le T} \sup_x |X^k(T+t,x) - X^k(t,x)| \ge \varepsilon\} \le \varepsilon.$$

Tightness of the sequence $\{X^k(t,x)\}_{t \ge 0}$ now follows from Corollary 5. □ 

3.4. Proof of Proposition 5: Preliminaries. The proofs of the estimates (70)-(71) are based on 
a simple recursive formula for the $m$th moment of a linear functional $\langle Y_n, \psi\rangle = \sum_x Y_n(x)\psi(x)$ of 
the particle density at time $n$. Observe that, for any integer $m \ge 1$, the $m$th power $\langle Y_n, \psi\rangle^m$ is the 
sum of all possible products $\prod_{i=1}^m \psi(x_i)$, where $x_i$ is the location of a particle in the $n$th generation 
(particles may be repeated). For any such product, the $r \le m$ particles involved will have a last 
common ancestor (LCE), situated at a site $z$ in the $k$th generation, for some $k \le n$. There are two 
possibilities: either there is just $r = 1$ particle in the product, in which case it is its own LCE and 
$k = n$, or there are at least two particles, in which case the LCE belongs to a generation $k < n$. 
Conditioning on the generation $k$ and location $z$ of the LCE leads to the following formula (for 
bounded functions $\psi$): 

n—l m 

(81) i^^(r„,^)™ = ^P"(x,z)^(z)™ + ^^P'^-(2;,^)E^- E Fn-k{t.m-z) 

z k=0 z r=2 meVrim) 

where 

(82)  $$P(x,y) = \tfrac{1}{3}\,1\{|x-y| \le 1\}$$

is the transition probability kernel for nearest neighbor random walk with holding, $\kappa_r$ is the $r$th 
descending factorial moment of the offspring distribution (the expected number of ways to choose 
$r$ particles from the offspring of any particle), $\mathcal{P}_r(m)$ is the set of all integer partitions of $m$ with $r$ 
nonzero elements $m_i$, and 

r 

(83) F„(^;m;z):=3-'^ ^ 1[E^+^^ {Y^,^;)^^ . 

ji=0,l,-l i=l 

Notice that in each term of the second sum in (81), $r \ge 2$, reflecting the fact that these terms 
correspond to final products with $r \ge 2$ distinct particles, in which the individual particles are 
repeated $m_i$ times. Since $\sum_i m_i = m$ and each $m_i \ge 1$, it follows that $m_i < m$: this is what 
makes formula (81) recursive. The formula (83) accounts for the possibility that the $r$ offspring will 
jump to random sites adjacent to the location $z$ of the LCE. The appearance of powers $P^k$ of the 
transition probability kernel of the nearest neighbor random walk with holding derives from the fact 
that the branching random walk is critical, so that the expected number of descendants at $(n,z)$ of 
an initial particle at $(0,x)$ is $P^n(x,z)$. Following are standard estimates that will be used to bound 
such transition probabilities. 

Lemma 6. There exist constants $C, \beta < \infty$ such that for all $x, y \in \mathbb{Z}$, $n \in \mathbb{N}$, and $0 < \alpha < 1$, 

(84)  $$P^n(0,x) \le Cn^{-1/2}\varphi_n(\beta x),$$

(85) |P"(0,a;) - P"(0,z/)| < C{n-^/^\x - y\ A l)<i>„(/3x, /3y), 

(86)  $$|P^n(0,x) - P^{n+\alpha n}(0,x)| \le Cn^{-1/2}\alpha\,\varphi_n(\beta x).$$

Proof. The first inequality follows from the local limit theorem and standard large deviations 
estimates for simple nearest-neighbor random walk. The second and third inequalities use also the fact 
that the Gauss kernel $\varphi(x)$ is uniformly Lipschitz in $x$. □ 
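
The first estimate (84) is easy to explore numerically. The sketch below is an illustration, not part of the proof: it builds the rows $P^n(0,\cdot)$ of the kernel (82) by repeated convolution and reports the largest value of $\sqrt{n}\,P^n(0,x)/\exp\{-\beta^2 x^2/n\}$ over a finite range of $n$. The function names, the choice $\beta^2 = 1/2$, and the range of $n$ are assumptions made here for illustration; the lemma itself only asserts that some pair $C, \beta$ works.

```python
import math

def next_row(probs):
    """Apply one step of the kernel P(x, y) = (1/3) 1{|x - y| <= 1} to a row P^n(0, .)."""
    new = [0.0] * (len(probs) + 2)
    for i, p in enumerate(probs):
        new[i] += p / 3.0      # jump to x - 1
        new[i + 1] += p / 3.0  # hold at x
        new[i + 2] += p / 3.0  # jump to x + 1
    return new

def worst_ratio(n_max, beta_sq=0.5):
    """Max over 1 <= n <= n_max, |x| <= n of sqrt(n) * P^n(0, x) / exp(-beta_sq * x^2 / n)."""
    probs = [1.0]  # P^0(0, .) is a unit mass at the origin
    worst = 0.0
    for n in range(1, n_max + 1):
        probs = next_row(probs)  # now probs[i] = P^n(0, i - n)
        for i, p in enumerate(probs):
            x = i - n
            worst = max(worst, math.sqrt(n) * p * math.exp(beta_sq * x * x / n))
    return worst

if __name__ == "__main__":
    # A bounded value here is what (84) predicts for this (assumed) choice of beta.
    print(worst_ratio(60))
```

Each row sums to one, which is the criticality statement used above: the expected number of descendants at $(n,z)$ of a single initial particle is $P^n(0,z)$, and these add up to one over $z$.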

Finally, we record some elementary inequalities for Gaussian densities: 

Lemma 7. For any $\beta > 0$ there exists $C < \infty$ such that for all $x \in \mathbb{Z}$, $n \ge 1$, and $k \ge n/2$, 

(87) ^^„(/3y)<CV^; 

yei. 

(88) Y.^'^iM^'^-kiPx- Py) < CV^r^^n{f3x/A); 

(89) exp{-/3a;V4n} < C,} exp{-/3(x ± 1)^2"}- 

3.5. Proof of (70). The proofs of the inequalities in Proposition 5 will proceed by induction on 
the power $m$. In cases (70)-(71), the starting point will be formula (81): we will use the induction 
hypothesis to bound the factors in the products (83). In each case it will be necessary to analyze 
terms separately in the ranges $k \le n/2$ and $k > n/2$. To prevent a proliferation of subscripts, we 
will adopt the convention that values of constants $C, \beta$ may change from one line to the next. In 
particular, in each inductive step we will relax the constant $\beta$ (from $\beta$ to $\beta/2$ or $\beta/4$) to account for 
differences $\pm 1$ in the arguments of exponentials: this is justified by (89) in Lemma 7 above. 

For the proof of (70), use the function $\psi = \delta_x$ in formula (81). When $m = 1$, the terms indexed 
by $0 \le k < n$ in (81) all vanish, leaving $E^0 Y_n(x) = P^n(0,x)$. Thus, the inequality (70) follows for 
$m = 1$ directly from Lemma 6. Assume now that (70) holds for all powers $< m$, where $m \ge 2$. By 
(81) and the induction hypothesis, 

$$E^0 Y_n(x)^m \le P^n(0,x) + C\sum_{k=0}^{n-1}\sum_z P^k(0,z) \sum_{\mathbf{m} \in \mathcal{P}(m)} \prod_{i=1}^{r}\left((n-k)^{-1+m_i/2}\,\varphi_{n-k}(\beta(x-z))\right)$$

where $\mathcal{P}(m) = \cup_{r=2}^{m}\mathcal{P}_r(m)$ is the set of all partitions of $m$ with at least two nonzero elements $m_i$. 
(Note that the constants $\kappa_r$ have been absorbed in $C$.) The initial term $P^n(0,x)$ has already been 
disposed of in the case $m = 1$ (since the right side of the inequality (70) is nondecreasing in $m$, provided 
$C_m \uparrow$ and $\beta_m \downarrow$). Thus, we need only consider the terms $k < n$ of the second sum. 

Consider first the terms $k \le n/2$. For indices in this range, $n - k \ge n/2$, and so the factors in 
the inner products can be handled by simply replacing each $n - k$ by $n$ (at the cost of a constant 
multiplier): Since the number $r$ of factors in each product is at least 2, and since the exponents $m_i$ 
in each interior product sum to $m$, 

$$\sum_{k \le n/2} \le Cn^{-2+m/2}\sum_{k \le n/2}\sum_z P^k(0,z)\,\varphi_n(\beta(x-z)) \le Cn^{-1+m/2}\varphi_n(\beta x).$$

Now consider the terms $k > n/2$. For such terms, $n - k$ is no longer comparable to $n$, and so 
the factors in the interior products cannot be estimated in the same manner as in the case $k \le n/2$. 
However, when $k > n/2$ the transition probability $P^k(0,z)$ can be estimated using the local limit 
theorem (84): hence, using the convolution inequality (88) of Lemma 7 and the induction hypothesis, 
and once again that each interior product has at least two factors, 

^ <Cn-i/2 J2 Y.^k{Pz){n~k)-^+"''\,,^k{P{x-z)) 

k>n/2 k>n/2 z 

k>n/2 

< Cn-i/V„(/32:)n'"/2-i/2^ 

as desired. This proves inequality (70). □ 

3.6. Proof of (71). This is also by induction on $m$. For notational ease, set $\psi_{x,y} = \delta_x - \delta_y$, where 
$\delta_z$ is the Kronecker delta, and set 

$$d_n(x,y) = (|x-y|/\sqrt{n}) \wedge 1.$$

Consider first the case $m = 1$. By formula (81) and the local limit bound (85), 

(90)  $$|E^0\langle Y_n, \psi_{x,y}\rangle| = |P^n(0,x) - P^n(0,y)| \le Cn^{-1/2}d_n(x,y)\,\Phi_n(\beta x, \beta y)$$

for suitable constants $C, \beta$. Inequality (71) for $m = 1$ follows easily. 

Assume then that inequality (71) is valid for all positive integer exponents less than $m$. The first 
sum on RHS (81) (the terms with $k = n$) can be bounded above by $Cn^{-1/2}(\varphi_n(\beta x) + \varphi_n(\beta y))$ using 
Lemma 6, and this in turn is bounded above by RHS (71). Thus, we need only consider the second 
sum on RHS (81) (the terms $k < n$). In each of these terms, the interior products (83) have at least 
two factors, each with $m_i \ge 1$, and in each of these products the sum of the exponents $m_i$ is $m$. 
Consequently, by (81) and the induction hypothesis, 



$$\sum_{k=0}^{n-1} \le C\sum_{k=0}^{n-1}\sum_z P^k(0,z)\,|x-y|^{m/5}(n-k)^{-2+m/2-m/10}\,\Phi_{n-k}(\beta x - \beta z, \beta y - \beta z).$$

Consider first the terms in the range $k \le n/2$: for these terms, $n - k \ge n/2$, and so $\Phi_{n-k}$ is 
comparable (after a relaxation of $\beta$) to $\Phi_n$. Hence, 

$$\sum_{k=1}^{n/2} \le Cn^{-2+m/2-m/10}|x-y|^{m/5}\sum_{k=1}^{n/2}\sum_z P^k(0,z)\,\Phi_{n-k}(\beta x - \beta z, \beta y - \beta z)$$
$$\le Cn^{-1}n^{m/2-m/10}|x-y|^{m/5}\,\Phi_n(\beta x, \beta y),$$

as desired. Now consider the range $k > n/2$: Here we use the local limit estimate for $P^k(0,z)$ and 
the Gaussian convolution inequality (88) to obtain 

$$\sum_{k=n/2}^{n-1} \le C|x-y|^{m/5}\sum_{k=n/2}^{n-1}\sum_z P^k(0,z)\,\Phi_{n-k}(\beta x - \beta z, \beta y - \beta z)(n-k)^{-2+m/2-m/10}$$
$$\le Cn^{-1/2}|x-y|^{m/5}\,\Phi_n(\beta x, \beta y)\sum_{k=n/2}^{n-1}(n-k)^{-3/2+m/2-m/10}.$$

The final inequality relies on the fact that the exponent $-1/2 + m/2 - m/10$ is positive for all $m \ge 2$. 
This proves (71). □ 
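
The role of the positive exponent can be spelled out as follows (a sketch; the shorthand $a$ and the constant $C_m$ are introduced here only for illustration). Write $a = -3/2 + m/2 - m/10 = -3/2 + 2m/5$. For $m \ge 2$ one has $a + 1 = -1/2 + 2m/5 \ge 3/10 > 0$, so, with $j = n - k$,

$$\sum_{k=n/2}^{n-1}(n-k)^{a} = \sum_{j=1}^{n/2} j^{a} \le C_m\, n^{a+1} = C_m\, n^{-1/2+m/2-m/10}.$$

Combined with the prefactor $n^{-1/2}|x-y|^{m/5}$, this gives $n^{-1}n^{m/2}|(x-y)/\sqrt{n}|^{m/5}$, which is the order of magnitude required on the right side of (71).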
3.7. Proof of (72). The moment formula (81) can no longer be used, since the expectation on 
LHS (72) involves the state of the branching random walk at two different times. However, it is not 
difficult to derive an analogous formula: For any integer $m \ge 1$, the $m$th power $(Y_t(x) - Y_s(x))^m$ 
is a sum of products $\prod_{i=1}^m \psi(x_i)$, where each $x_i$ is the location of a particle in either the $t$th or $s$th 
generation, and $\psi(x_i) = \pm\delta_x(x_i)$, with the sign $\pm$ depending on whether the particle is in the $t$th 
or $s$th generation. As in formula (81), the particles involved in any such product must have a last 
common ancestor in some generation $k$ before the $(s \wedge t)$th generation. Conditioning on the generation 
and location of this last common ancestor leads to the formula 

(91) Ey{Ys{x)-Yt{x)r =P'(y,^) + (-i)"P*(y,x) 

sAt m 

k=0 z r=2 meVrim) 

where $\kappa_r$ and $\mathcal{P}_r(m)$ have the same meanings as in the moment formula (81) and 

r 

(92) G,,t(z;m) = 3-'- ^ [] i?^"' (^(z) - rt(z))"^ 

ji=0,l,-l i=l 

We will use (91) to prove (72) by induction on the power $m$, using arguments similar to those 
used in proving (70) and (71). Consider first the case $m = 1$: in this case the last sum in (91) 
vanishes, leaving 

(93)  $$E^0(Y_n(x) - Y_{n+\alpha n}(x)) = P^n(0,x) - P^{n+\alpha n}(0,x).$$

Hence, inequality (72) follows immediately from estimate (86). 

Assume now that (72) holds for all positive integer exponents smaller than $m$, for some integer 
$m \ge 2$. To prove that (72) holds for the exponent $m$, consider RHS (91), with $s = n$ and $t = n + \alpha n$. 
The sum of the first two terms coincides with (93). To verify that this sum, in absolute value, is 
smaller than RHS (72), observe that by (86) the sum is smaller than $C\alpha n^{-1/2}\varphi_n(\beta x)$; since $\alpha n$ must 
be an integer, $\alpha \ge n^{-1}$, and so 

n^/^a < 

for all $m \ge 2$, as required by (72). Thus, it remains to prove that the sum of the terms with $r \ge 2$ 
in (91) is also bounded by RHS (72). 

Note first that it suffices to consider values of $\alpha \le 1/2$, because for $1/2 < \alpha < 1$ the factor $\alpha^{m/5}$ 
on RHS (72) is bounded below. Now split the terms of the last sum in (91) into three ranges: first 
$k \le n/2$, then $n/2 < k \le n - n\alpha$, and finally $n - n\alpha < k \le n$. In the range $k \le n - n\alpha$, the 
induction hypothesis applies, because for these terms $\alpha' = n\alpha/(n-k) \le 1$. (Recall that one of the 
hypotheses of Proposition 5 is that $\alpha < 1$, so to use (72) in an induction argument the implied $\alpha'$ 
cannot exceed 1.) For $k \le n/2$, the ratio $n/(n-k)$ is bounded above by 2; consequently, 

E^'(0'^)""'^"^'""^'^"-'»(/3x-/3z) 

k<n/2 k<n/2 z 

k<n/2 

< Cn-i+"/2a™/Vn(/52;), 
which agrees with RHS (72). Next, consider terms in the range $n/2 < k \le n - n\alpha$: By the induction 
hypothesis, the local limit bound (84), and the Gaussian convolution inequality (88), 

$$\sum_{k=n/2}^{n-n\alpha} \le C\sum_{k=n/2}^{n-n\alpha}\sum_z P^k(0,z)(n-k)^{-2+m/2}(\alpha n/(n-k))^{m/5}\,\varphi_{n-k}(\beta(x-z))$$
$$\le Cn^{-1/2}n^{m/5}\alpha^{m/5}\varphi_n(\beta x)\sum_{k=n/2}^{n-n\alpha}(n-k)^{-3/2+m/2-m/5}.$$

The last inequality uses the fact that the exponent $-1/2 + m/2 - m/5$ is positive for all $m \ge 2$. 

Finally, consider the terms in the range $n - n\alpha < k \le n$. Here the induction hypothesis cannot 
be used, because $n\alpha/(n-k) > 1$. Instead we use the bound (70), which we have already proved is 
valid for all $m$. This, together with (84) and (88), implies 

$$\sum_{k=n-n\alpha}^{n-1} \le C\sum_{k=n-n\alpha}^{n-1}\sum_z P^k(0,z)(n-k)^{-2+m/2}\,\varphi_{n-k}(\beta(x-z))$$
$$\le Cn^{-1/2}\varphi_n(\beta x)\sum_{k=n-n\alpha}^{n-1}(n-k)^{-3/2+m/2}$$
$$\le Cn^{-1+m/2}\alpha^{-1/2+m/2}\varphi_n(\beta x).$$

This completes the proof of (72). □ 
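
For comparison with the right side of (72), the following elementary step may be helpful (a sketch added for illustration). Since $0 < \alpha \le 1$ and

$$\left(-\tfrac{1}{2} + \tfrac{m}{2}\right) - \tfrac{m}{5} = \tfrac{3m - 5}{10} > 0 \quad\text{for all } m \ge 2,$$

one has $\alpha^{-1/2+m/2} \le \alpha^{m/5}$, and therefore

$$Cn^{-1+m/2}\alpha^{-1/2+m/2}\varphi_n(\beta x) \le Cn^{-1}n^{m/2}\alpha^{m/5}\varphi_n(\beta x),$$

which is of the form required in (72).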

Acknowledgments. The author thanks Regina Dolgoarshinnykh and Xinghua Zheng for useful 
discussions. 